CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. VI- NO. 1 
1954 


Points and spaces L. E. J. Brouwer 
Note on the gamma function B. van der Pol 
The first factor of a cyclic field Leonard Carlitz 


Core-consistency and total inclusion 
G. G. Lorentz and A. Robinson 


Balanced incomplete block designs 
Marshall Hall Jr. and W. S. Connor 


Completeness of order statistics D. A. S. Fraser 
Scale and location parameters D. A. S. Fraser 
Unitary transformations B. E. Mitchell 
Geodesic solutions of differential equations G. F. D. Duff 
Chromatic polynomials W. T. Tutte 


Curves invariant under cyclic involutions 
W. R. Hutcherson and S. T. Gormsen 


Parallel curves G. P. Henderson 
On sums of vectors F. A. Behrend 


On a theorem of Osima and Nagao 
J. S. Frame and G. de B. Robinson 


A construction for Wythoffian polytopes G. C. Shephard 
On the symmetries of spherical harmonics Burnett Meyer 
Sur certains sous-espaces vectoriels de L^p A. Grothendieck 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 
University of Toronto Press 








EDITORIAL BOARD 


H. S. M. Coxeter, A. Gauthier, R. D. James, R. L. Jeffery, 
G. de B. Robinson, H. Zassenhaus 


with the co-operation of 


R. Brauer, L. E. J. Brouwer, H. Cartan, D. B. DeLury, I. Halperin, 
L. Infeld, S. MacLane, M. H. A. Newman, G. Pall, B. Segre, 
J. L. Synge, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Everything 
possible should be done to lighten the task of the reader; the notation 
and reference system should be carefully thought out. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers is 
$6.00. This is reduced to $3.00 for individual members of the 
following Societies: 

Canadian Mathematical Congress 
American Mathematical Society 
Mathematical Association of America 
London Mathematical Society 

Société Mathématique de France 
Scandinavian Mathematical Societies 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of British Columbia 


Carleton College Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 

National Research Council of Canada 

and the 


American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 











L. E. J. BROUWER 


VISITING LECTURER, SUMMER SEMINAR, KINGSTON, 1953 











POINTS AND SPACES 
L. E. J. BROUWER 


1. The gradual disengagement of mathematics from logic. Beginning 
with a historical review of the development of mathematical thought, we have 
to consider successively (cf. 10, pp. 139-140): 

(1) The observational period. For some familiar regularities of (outer or inner) 
experience of time and space, which, to any attainable degree of approximation, 
seemed invariable, absolute and sure invariability was postulated. These regu- 
larities were called axioms and were put into language. Thereupon extensive 
systems of properties were developed from the linguistic substratum of the 
axioms by means of reasoning, guided by experience but linguistically following 
and using the principles of classical logic. This logic was considered autonomous, 
and mathematics was considered more or less dependent on logic. 

(2) The revolution in science of space. In the course of the 19th and the be- 
ginning of the 20th century, on the one hand geometry was gradually meta- 
morphosed into a chapter of the science of numbers, and on the other hand 
Euclidean three-dimensional geometry lost its privileged character since a 
great number of other geometries originating from logical speculations, with 
properties distinct from the traditional but no less beautiful, found an arith- 
metical representation likewise. 

(3) The old formalist school. Encouraged by the important part which had 
been played in the above metamorphosis of geometry by the logico-linguistic 
method, the old formalist school merged logic and mathematics into a single 
linguistic science, operating on meaningless words or symbols by means of 
logical rules, thus divesting logic and mathematics of their difference in character 
as well as of their autonomy. 

(4) The pre-intuitionist school, by which autonomy and apriority were 
re-established for logic and established for the major part of “separable” 
mathematics. For the continuum however, this school on some occasions seems to 
have contented itself with an ever-unfinished and ever-denumerable set of 
real numbers which can never have a measure positively different from zero; 
on other occasions it seems to have stuffed the continuum with elements pro- 
viding measure by means of some logical axiom. In both cases, in its further 
development of mathematics, it has unreservedly applied classical logic. So, 
logic and an introductory part of mathematics were autonomous here. The 
rest of mathematics was dependent on them. 

(5) The new formalist school, by which autonomy and apriority were postulated 
for mathematics of the second order, i.e., for scientific consideration of the symbols 


Received September 1, 1953. Lectures presented at the Seminar of the Canadian 
Mathematical Congress at Kingston, Ont., Aug. 10-31, 1953. 













occurring in purified mathematical language, and of the rules of manipulating 
these symbols. This scientific consideration of language, later on called meta- 
mathematics, although using complete induction, apriorizes much less than 
pre-intuitionism. What it seems to have overlooked is that between perfection 
of mathematical language and perfection of mathematics proper, no clear 
connection can be seen. 

(6) The intervention of intuitionism by two acts of which the first seems 
necessarily to lead to destructive and sterilizing consequences, whereas the 
second yields ample possibilities for recovery and new developments. 

The first act of intuitionism completely separates mathematics from mathe- 
matical language, in particular from the phenomena of language which are 
described by theoretical logic. It recognizes that mathematics is a languageless 
activity of the mind having its origin in the basic phenomenon of the perception 
of a move of time, which is the falling apart of a life moment into two distinct 
things, one of which gives way to the other, but is retained by memory. If the 
two-ity thus born is divested of all quality, there remains the common sub- 
stratum of all two-ities, the mental creation of the empty two-ity. This empty 
two-ity and the two unities of which it is composed, constitute the basic mathe- 
matical systems. And the basic operation of mathematical construction is the 
mental creation of the two-ity of two mathematical systems previously acquired, 
and the consideration of this two-ity as a new mathematical system. 

It is introspectively realized how this basic operation, continually displaying 
unaltered retention by memory, successively generates each natural number, 
the infinitely proceeding sequence of the natural numbers, arbitrary finite 
sequences and infinitely proceeding sequences of mathematical systems pre- 
viously acquired, finally a continually extending stock of mathematical systems 
corresponding to “separable” systems of classical mathematics. 

The second act of intuitionism recognizes the possibility of generating new 
mathematical entities: 

First, in the form of infinitely proceeding sequences whose terms are chosen 
more or less freely from mathematical entities previously acquired; in such a way 
that the freedom existing perhaps at the first choice may be irrevocably sub- 
jected, again and again, to progressive restrictions at subsequent choices, while 
all these restricting interventions, as well as the choices themselves, may, at 
any stage, be made to depend on possible future mathematical experiences of 
the creating subject; 

Secondly, in the form of mathematical species, i.e., properties supposable for 
mathematical entities previously acquired, and satisfying the condition that, if 
they hold for a certain mathematical entity, they also hold for all mathematical 
entities which have been defined to be equal to it, equality having to be sym- 
metric, reflexive, and transitive, and the empty two-ity being forbidden to be 
equalized to an empty unity. Mathematical entities for which the property in 
question holds, are called elements of the corresponding species. 

In the edifice of mathematical thought based on the first and second act 







of intuitionism, language plays no other part than that of an efficient, but 
never infallible or exact, technique for memorizing mathematical constructions, 
and for suggesting them to others; so that the wording of a mathematical 
theorem has no sense unless it indicates the construction either of an actual 
mathematical entity or of an incompatibility (e.g., the identity of the empty 
two-ity with an empty unity) out of some constructional condition imposed 
on a hypothetical mathematical system. So that mathematical language, in 
particular logic, can never by itself create new mathematical entities, nor deduce 
a mathematical state of things. 

However, notwithstanding this rejection of classical logic as an instrument 
to discover mathematical truths, intuitionist mathematics has its general 
introspective theory of mathematical assertions, a theory which with some right 
may be called intuitionist mathematical logic, and to which belongs a theory of 
the principle of the excluded third. 

In intuitionism this principle is also called the principle of judgeability. It is 
either (in its simple form) an assertion $A'$ about a single primary assertion $A$ 
or (in its extended form) a species $(A'_\alpha)$ of assertions about the elements of a 
species $(A_\alpha)$ of primary assertions saying that each $A_\alpha$ can be judged, i.e., can 
either be proved to be true or be proved to be contradictory. 

This principle of judgeability entails the following two corollaries which are 
weaker: 

(i) The principle of testability, being (in its extended form) a species $(A''_\alpha)$ 
of assertions about the elements of the species $(A_\alpha)$ saying that each $A_\alpha$ can be 
tested, i.e., can either be proved to be non-contradictory or be proved to be 
contradictory. 

(ii) The principle of reciprocity of complementarity, being (in its extended 
form) a species $(A'''_\alpha)$ of assertions about the elements of the species $(A_\alpha)$, 
saying that each $A_\alpha$, if proved to be non-contradictory, can also be proved 
to be true. 
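
The relation between the three principles in their simple form can be written out symbolically. The sketch below assumes the customary reading, which is not notation used in the text: "A is contradictory" is written $\lnot A$ and "A is non-contradictory" is written $\lnot\lnot A$.

```latex
\begin{align*}
\text{judgeability:}\quad & A \lor \lnot A \\
\text{testability:}\quad & \lnot\lnot A \lor \lnot A \\
\text{reciprocity of complementarity:}\quad & \lnot\lnot A \rightarrow A
\end{align*}
```

Under this reading both corollaries follow from judgeability by a case distinction. If $A$ has been proved, then $\lnot\lnot A$ holds, which gives testability, and reciprocity holds because its conclusion does. If $\lnot A$ has been proved, testability is immediate, and reciprocity holds because its premise $\lnot\lnot A$ is then impossible. Conversely, testability and reciprocity together give judgeability: of the two alternatives $\lnot\lnot A$ and $\lnot A$, reciprocity turns the first into $A$.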

In intuitionism, of course, all three of these principles, being assertions 
about assertions, are only then “realized,” i.e., only then convey truths, when 
these truths have been experienced. On this basis it can be proved that the 
extended principles are not only not true, but even contradictory. On the other 
hand, in their simple form, all three of the principles are, although not true, at 
least non-contradictory. 

The assertion of an incompatibility is called a negative assertion. In the field 
of negative assertions, the principle of reciprocity of complementarity is realized, 
and the principles of judgeability and testability are equivalent (9, pp. 1245- 
1246). 

2. The refutation of the principle of the excluded third. The first act of 
intuitionism enables us to construct the linear rational grid. On the basis of 
this, by virtue of the second act of intuitionism, we introduce the linear con- 
tinuum in the following way: By a limiting number we understand a (not neces- 
sarily predeterminate) convergent sequence of rational numbers. Then, regard- 













ing as self-explanatory the meaning of a coincidence of two limiting numbers, 
we call the species of limiting numbers coinciding with a given limiting number, 
a limiting number core. A predeterminate limiting number is also called a sharp 
limiting number, and a limiting number core containing a sharp limiting number 
is called a sharp limiting number core. The species of the limiting number cores 
is called the linear continuum or the continuum. 

In order to furnish examples refuting the principle of the excluded third 
and its corollaries, we introduce the notion of a drift (cf. 9, pp. 1246-1247). 
By a drift we understand the union $\gamma$ of a convergent fundamental sequence 
of limiting number cores $c_1(\gamma), c_2(\gamma), \ldots$ called the counting cores of the drift, 
and the accumulation number core $c(\gamma)$ of this sequence, called the kernel of 
the drift, all counting cores lying apart from each other and from the kernel. 
(We say that $a$ lies apart from $b$ if there is some natural number $n$ such that 
$|b - a| > 2^{-n}$.) 

Let $\alpha$ be a mathematical assertion so far neither tested nor recognized as 
testable. Then, in connection with the assertion $\alpha$ and with a drift $\gamma$ the creating 
subject can generate an infinitely proceeding sequence $R(\gamma, \alpha)$ of limiting 
number cores $c_1(\gamma, \alpha), c_2(\gamma, \alpha), \ldots$ according to the following direction: As 
long as during the choice of the $c_n(\gamma, \alpha)$ the creating subject has not experienced 
the truth of $\alpha$ [has neither experienced the truth nor the absurdity of $\alpha$], each 
$c_n(\gamma, \alpha)$ is chosen equal to $c(\gamma)$. But as soon as between the choice of $c_{r-1}(\gamma, \alpha)$ 
and that of $c_r(\gamma, \alpha)$ the creating subject has experienced the truth of $\alpha$ [has 
either experienced the truth or the absurdity of $\alpha$], $c_r(\gamma, \alpha)$, and likewise $c_{r+\nu}(\gamma, \alpha)$ 
for each natural number $\nu$, is chosen equal to $c_r(\gamma)$. This sequence $R(\gamma, \alpha)$ 
converges to a limiting number core $C(\gamma, \alpha)$ [$D(\gamma, \alpha)$] which will be called a 
conditional checking-core of $\gamma$ through $\alpha$ [direct checking-core of $\gamma$ through $\alpha$]. 

Let $\gamma$ be a drift whose counting cores are rational and whose kernel is 
irrational. Then the assertion of the rationality of a $D(\gamma, \alpha)$ is not judgeable, 
but it is testable, because the assertion of irrationality of $D(\gamma, \alpha)$ would entail 
the simultaneous contradictority of the truth and the absurdity of $\alpha$, which is 
an absurdity. 

On the other hand, truth of $\alpha$ and rationality of $C(\gamma, \alpha)$ are equivalent. So 
the assertion of the rationality of $C(\gamma, \alpha)$ is neither judgeable nor testable. For, 
non-contradictority of rationality of $C(\gamma, \alpha)$ would entail non-contradictority 
of $\alpha$, i.e. testability of $\alpha$, which was presupposed not to exist. Furthermore, if 
some day $\alpha$ would prove to be non-contradictory without being true, rationality 
of $C(\gamma, \alpha)$ likewise would be non-contradictory without being true. So for 
rationality of $C(\gamma, \alpha)$, just as for $\alpha$, non-contradictority would not be 
equivalent to truth. 
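
The direction for generating the sequence R can be mimicked in code, with a strong caveat: the point of the construction is that the creating subject does not know in advance whether or when the assertion will be settled, whereas a program needs a stand-in for that experience. The sketch below is only an illustration under assumptions that are not in the text: a hypothetical drift with counting cores 2^-n and kernel 0 (chosen for simplicity, so the kernel is rational here), and a hypothetical callable `experienced_by(n)` reporting whether the relevant experience has occurred before the n-th choice.

```python
from fractions import Fraction
from itertools import islice
from typing import Callable, Iterator, List, Optional

KERNEL = Fraction(0)  # kernel of the hypothetical drift


def counting_core(n: int) -> Fraction:
    """n-th counting core of the hypothetical drift: 2**-n, apart from 0 and from each other."""
    return Fraction(1, 2 ** n)


def checking_sequence(experienced_by: Callable[[int], bool]) -> Iterator[Fraction]:
    """Yield the terms of R one by one.

    Each term equals the kernel until the experience has occurred; from the
    first stage r at which it has occurred, every term equals the r-th
    counting core.
    """
    frozen: Optional[Fraction] = None
    n = 0
    while True:
        n += 1
        if frozen is None and experienced_by(n):
            frozen = counting_core(n)  # stage r: fix the r-th counting core for good
        yield KERNEL if frozen is None else frozen


def first_terms(experienced_by: Callable[[int], bool], k: int) -> List[Fraction]:
    """The first k terms of the sequence, for inspection."""
    return list(islice(checking_sequence(experienced_by), k))
```

With `experienced_by = lambda n: n >= 4` the terms are 0, 0, 0, 1/16, 1/16, and so on. With an oracle that never reports the experience, every term inspected so far equals the kernel, and no finite inspection decides whether the limit is the kernel or a counting core.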

Obviously the field of validity of the principle of the excluded third is identical 
with the intersection of the field of validity of the principle of testability and 
that of the principle of reciprocity of complementarity. Furthermore, the first 
field of validity is a proper subfield of each of the others, as is shown by the 
following examples: 







Let A be the species of the direct checking-cores of drifts with rational 
counting cores, B the species of the irrational limiting number cores, C the union 
of A and B. Then all assertions of rationality of an element of C satisfy the 
principle of testability, while, as we have seen, there are assertions of rationality 
of an element of C not satisfying the principle of the excluded third. 

Again, all assertions of equality of two limiting number cores satisfy the 
principle of reciprocity of complementarity, whereas there are assertions of 
equality of two limiting number cores not satisfying the principle of the excluded 
third. 

In the domain of mathematical assertions the property of absurdity, like 
the property of truth, is a universally additive property, that is to say, if it holds 
for each element α of a species of assertions, it also holds for the assertion 
which is the union of the assertions α. This property of universal additivity does 
not obtain for the property of non-contradictority. However, non-contradictority 
does possess the weaker property of finite additivity, that is to say, if the 
assertions ρ and σ are non-contradictory, the assertion τ, which is the union of ρ 
and σ, is also non-contradictory. 

Applying the latter theorem to the special non-contradictory assertions that 
are the enunciations of the principle of the excluded third for a single assertion, 
we see that a simultaneous enunciation of this principle for a finite number of 
assertions is likewise non-contradictory. 
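
The finite additivity of non-contradictority admits a short derivation. The sketch below is one possible argument, not taken from the text; it writes $\rho$ and $\sigma$ for the two assertions and assumes that their union is read as the conjunction $\rho \land \sigma$ and that non-contradictority is read as double negation.

```latex
\begin{align*}
&\text{Assume } \lnot\lnot\rho,\ \lnot\lnot\sigma, \text{ and, towards an absurdity, } \lnot(\rho \land \sigma). \\
&\text{From } \lnot(\rho \land \sigma):\quad \rho \rightarrow \lnot\sigma. \\
&\text{Together with } \lnot\lnot\sigma:\quad \rho \rightarrow \bot, \text{ i.e. } \lnot\rho. \\
&\text{This is incompatible with } \lnot\lnot\rho, \text{ hence } \lnot\lnot(\rho \land \sigma).
\end{align*}
```

Taking $\rho$ and $\sigma$ to be the enunciations $A \lor \lnot A$ and $B \lor \lnot B$ and repeating the step finitely often gives the simultaneous statement for any finite number of assertions.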


As to the long belief in the universal validity of the principle of the excluded 
third in mathematics, intuitionism considers it as a phenomenon of the history 
of civilization of the same kind as the old-time belief in the rationality of π or 
in the rotation of the firmament on an axis passing through the earth. And 
intuitionism tries to explain the long persistence of this dogma by two facts: 
first, the obvious non-contradictority of the principle for an arbitrary single 
assertion; secondly, the practical validity of the whole of classical logic for an 
extensive group of simple every-day phenomena. The latter fact apparently 
made such a strong impression that the play of thought which classical logic 
originally was, became a deep-rooted habit of thought which was considered 
not only as useful but even as aprioristic. 


The above rejection of the universal truth of the principle of the excluded 
third in mathematics will make it plausible that intuitionist arguing requires 
a preliminary formulation of several definitions which sometimes split atomic 
notions of classical mathematics. 

Two mathematical entities will be called different if their equality proves to 
be absurd. The notation for equality and difference will be = and ≠ respectively. 

Two infinite sequences of mathematical entities $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$ 
will be said to be equal, or identical, if $a_\nu = b_\nu$ for each $\nu$, and distinct, if a 
natural number $n$ can be indicated (or calculated) such that $a_n$ and $b_n$ are different. 

A species is called discrete if any two of its elements can be proved either to 
be equal or to be different. 










If the species M possesses an element which cannot possibly belong to the 
species N, we shall say that M deviates from N. 

The species M will be called a subspecies of the species N, and we shall write 
M ⊂ N if every element of M can be proved to belong to N. If, in addition, 
N deviates from M, then M is called a proper subspecies of N. If each element 
of N either belongs to M or cannot possibly belong to M, then M is called a 
removable subspecies of N. 

Two species are said to be equal, or identical, if for each element of either of 
them an element of the other, equal to it, can be indicated. They are called 
different if their equality is absurd, and congruent if neither can deviate from 
the other. 

Let M be the linear continuum, A and B the species of the rational and the 
irrational limiting number cores respectively, then M and the union of A and 
B are congruent and different at the same time! 

A species which cannot possess an element is said to be empty. Two different 
species whose intersection is empty are called disjoint. 

If M and N are disjoint subspecies of the species P, and the union of M and 
N is congruent to P, we shall say that P is composed of M and N, and that 
M and N are conjugate subspecies of P. Thus, e.g., the species of exponents of 
Fermat's equation which render it solvable and unsolvable respectively, are 
conjugate subspecies of the species of the natural numbers. 

For a given P, for any subspecies M, a subspecies N can be indicated such 
that M and N are conjugate subspecies of P. This N, in general, is not even 
uniquely determined by P and M. Thus, e.g., if P is the linear continuum, and 
M the species of the irrational limiting number cores, then for N we may choose 
the species of those limiting number cores whose rationality is non-contradictory 
as well as the species of the rational limiting number cores. 

If H and K are disjoint subspecies of the species P, and the union of H and 
K is identical with P, so that H and K are conjugate removable subspecies of 
P, we shall say that P splits into H and K. Thus, e.g., the species of the prime 
numbers and of the composite numbers are conjugate removable subspecies 
of the species of the natural numbers. 

For an arbitrary proper subspecies H of P one cannot, in general, indicate 
a K such that H and K are conjugate removable subspecies of P. There are 
even species (e.g., the linear continuum) which possess no removable proper 
subspecies at all. 

If V and W are conjugate subspecies of P, and if in addition V consists of 
those elements of P which cannot belong to W, and W of those elements of P 
which cannot belong to V, we shall say that P is directly composed of V and W, 
and that V and W are directly conjugate subspecies of P. Thus, e.g., the species 
consisting of those elements of P for which a certain negative property is true 
and absurd respectively, are directly conjugate subspecies of P. 

If between two species M and N a (not necessarily predeterminate) 1-1 
correspondence can be created, i.e., if M can be mapped onto N in such a way 







that equal and only equal elements of M have equal images in N, while each 
element of N is the image of some element of M, we shall say that M and N 
are equipotential. 

A species which is equipotential to some natural number [to the infinite 
sequence of natural numbers] will be called finite [denumerably infinite]. 

A species which contains a denumerably infinite subspecies will be called 
infinite. 


3. Spreads and fans. Spreads and fans are fundamental notions in in- 
tuitionism. Their introduction requires some further definitions. 

By a node of order n we understand a sequence of n natural numbers (n ≥ 1) 
called the constituents of the node. 

A node p’ of order n + m (m ≥ 1) will be called an mth descendant of the 
node p of order n, and p will be called the mth ascendant of p’, if the sequence 
of constituents of p is an initial segment of the sequence of constituents of p’. 

If m = 1, p’ will also be called an immediate descendant of p and p the im- 
mediate ascendant of p’. 

The species $Q_p$ of the immediate descendants of the node p of order n 
considered in their natural order (i.e., ordered according to their last constituent) 
will be called a row of nodes of order n + 1 and the ramifying row of p, while 
p will be called the dominant of $Q_p$. 

The species of the nodes of order 1 considered in their natural order will be 
called the row of nodes of order 1. 

A finite sequence of nodes consisting of a node $p_1$ of order 1, an immediate 
descendant $p_2$ of $p_1$, an immediate descendant $p_3$ of $p_2$, . . ., up to an immediate 
descendant $p_n$ of $p_{n-1}$, will be called a rod of order n. 
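
The definitions of node, descendant, ascendant and rod translate directly into operations on finite sequences. The following is a minimal sketch, assuming (as a representation choice, not something the text prescribes) that a node is stored as a Python tuple of positive integers; all function names are illustrative.

```python
from typing import List, Tuple

# A node of order n is represented as a tuple of n natural numbers (n >= 1).
Node = Tuple[int, ...]


def order(p: Node) -> int:
    """Order of a node: the number of its constituents."""
    return len(p)


def is_descendant(q: Node, p: Node) -> bool:
    """True if q is an m-th descendant of p for some m >= 1,
    i.e. the constituents of p form a proper initial segment of those of q."""
    return len(q) > len(p) and q[:len(p)] == p


def immediate_ascendant(q: Node) -> Node:
    """The node obtained by dropping the last constituent of q."""
    if len(q) < 2:
        raise ValueError("a node of order 1 has no ascendant")
    return q[:-1]


def rod_to(q: Node) -> List[Node]:
    """The rod of order len(q) that ends at q: all non-empty initial segments of q."""
    return [q[:k] for k in range(1, len(q) + 1)]
```

For example, `rod_to((3, 1, 4))` gives `[(3,), (3, 1), (3, 1, 4)]`, and `(3, 1, 4)` is a second descendant of `(3,)`.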

An infinite (not necessarily predeterminate) sequence of nodes consisting 
of a node $p_1$ of order 1, an immediate descendant $p_2$ of $p_1$, an immediate 
descendant $p_3$ of $p_2$, and so on ad infinitum, will be called an arrow. 

Naturally an arrow may grow in complete freedom, i.e., in the passage from 
$p_\nu$ to $p_{\nu+1}$, the choice of a new constituent for $p_{\nu+1}$ to be joined to those of $p_\nu$ 
may be completely free for each $\nu$, for as long as the creating subject may desire. 
On the other hand this freedom in the generation of the arrow may at any stage 
be completely abolished, at the beginning or at any $p_\nu$, by means of a law fixing 
all further nodes in advance. From this moment the arrow concerned will be 
called a sharp arrow. Furthermore, the freedom in the generation of the arrow, 
without being completely abolished, may, at any $p_\nu$, undergo some restriction, 
and this restriction may be intensified at further $p_\nu$’s. Finally, all these 
interventions, by virtue of the second act of intuitionism, may, at any stage, be 
made to depend on possible future mathematical experiences of the creating 
subject. 

Let $\rho$ be a natural denumeration of the species of the nodes, i.e., a 
denumeration $a_1, a_2, \ldots$ of the nodes such that each node comes before its descendants, 
and before the nodes which it precedes in its row of nodes. Then, without 
knowledge of further details of this denumeration, as soon as in $\rho$ for each $a_\nu$ a 
sequence 

$a_{\nu_1}, a_{\nu_2}, \ldots$ 

with ever increasing indices, can be indicated as its ramifying row, the sequence 
of constituents of any given $a_\nu$ can be reconstructed. 

An example of a natural denumeration of the species of the nodes can be 
given as follows: Let $G_n$ be the species of the nodes of order $\leq n$ and constituents 
$\leq n$, $G_{n,\nu}$ the species of the nodes of $G_n$ of order $\nu$, and $\Delta_n$ ($n \geq 2$) the species of 
the nodes of $G_n$ not belonging to $G_{n-1}$. Each $G_{n,\nu}$ is counted in such a way that 
p precedes q if the first constituent in which they differ is smaller for p than for 
q. If we then make each $G_{n,\nu}$ precede $G_{n,\nu+1}$ we get a natural denumeration 
of $G_n$, and with it one of $\Delta_n$. Finally, by successively counting $\Delta_2$ after $G_1$, $\Delta_3$ after 
$\Delta_2$, $\Delta_4$ after $\Delta_3$, and so on, we arrive at a natural denumeration of the species of the nodes. 
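
A denumeration of this kind can be sketched as a generator. The code below is an illustration under stated assumptions rather than a transcription of the construction: nodes are tuples of positive integers, natural numbers start at 1, and the n-th "level" consists of the nodes of order at most n with constituents at most n that did not already occur at level n - 1. The names `level`, `natural_denumeration` and `respects_natural_order` are hypothetical.

```python
from itertools import product
from typing import Iterable, Iterator, Tuple

Node = Tuple[int, ...]


def level(n: int) -> Iterator[Node]:
    """Nodes of order <= n and constituents <= n that are new at stage n,
    counted by increasing order and, within one order, lexicographically."""
    for nu in range(1, n + 1):
        # product over a sorted range yields tuples in lexicographic order
        for p in product(range(1, n + 1), repeat=nu):
            if nu == n or max(p) == n:  # not already of order <= n-1 with constituents <= n-1
                yield p


def natural_denumeration() -> Iterator[Node]:
    """Count level 1, then level 2, then level 3, and so on without end."""
    n = 0
    while True:
        n += 1
        yield from level(n)


def respects_natural_order(prefix: Iterable[Node]) -> bool:
    """Check on a finite initial piece that every node comes after its immediate
    ascendant and after the node just before it in its row."""
    seen = set()
    for p in prefix:
        if len(p) > 1 and p[:-1] not in seen:
            return False
        if p[-1] > 1 and p[:-1] + (p[-1] - 1,) not in seen:
            return False
        seen.add(p)
    return True
```

The first six nodes produced are `(1,)`, `(2,)`, `(1, 1)`, `(1, 2)`, `(2, 1)`, `(2, 2)`. The check function only inspects a finite prefix, so it can support but never establish the property for the whole denumeration.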

We proceed to consider a (not necessarily predeterminate) species of nodes 
K containing: 

(i) of the nodes of order 1, either all natural numbers or those and only those 
natural numbers which do not exceed a definite natural number $m_0$; 

(ii) for each n ≥ 1, of the nodes of order n + 1 which are immediate 
descendants of the node p of order n belonging to K, either all of them or those 
and only those whose (n + 1)st constituent joined to those of p does not exceed 
a definite natural number $m_p$. 

Such a species of nodes K will be called a spread direction, and the species 
w(K) of the arrows which consist of nodes of K will be called a spread. 

The spread direction for which from the above alternatives always the first 
is chosen is called the universal spread direction, and the corresponding spread 
is called the universal spread. 

A spread direction for which from the above alternatives always the second 
is chosen is called a fan direction, and the corresponding spread is called a fan. 
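
The two alternatives in the definition of a spread direction can be modelled by a single function that, for each node, either imposes no restriction or returns the bound on the next constituent. This is a sketch under assumptions not found in the text: nodes are tuples of positive integers, the empty tuple `()` is used as a stand-in for the choice governing nodes of order 1, and `None` encodes the first alternative (all immediate descendants admitted).

```python
from typing import Callable, List, Optional, Tuple

Node = Tuple[int, ...]
# bound(p) is None when all immediate descendants of p are admitted,
# otherwise the definite natural number limiting the next constituent.
Bound = Callable[[Node], Optional[int]]


def in_spread_direction(q: Node, bound: Bound) -> bool:
    """True if every constituent of q respects the bound set at its ascendant."""
    for k in range(len(q)):
        m = bound(q[:k])
        if q[k] < 1 or (m is not None and q[k] > m):
            return False
    return True


def fan_nodes_of_order(n: int, bound: Bound) -> List[Node]:
    """All nodes of order n of a fan direction, listed in natural order.
    Raises ValueError if some bound is missing, i.e. if this is not a fan direction."""
    nodes: List[Node] = [()]
    for _ in range(n):
        extended: List[Node] = []
        for p in nodes:
            m = bound(p)
            if m is None:
                raise ValueError("unbounded ramifying row: not a fan direction")
            extended.extend(p + (c,) for c in range(1, m + 1))
        nodes = extended
    return nodes
```

With `bound = lambda p: 2` one obtains a binary fan direction, whose nodes of order 2 are `(1, 1)`, `(1, 2)`, `(2, 1)`, `(2, 2)`; with `bound = lambda p: None` every node belongs, which corresponds to the universal spread direction. Only fan directions have finitely many nodes of each order, which is why the listing function refuses the unbounded case.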

As each spread direction is a subspecies of the universal spread direction 
USD (just as each spread is a subspecies of the universal spread US), any 
natural denumeration (in the above sense) of USD generates a natural 
denumeration of each spread direction. Furthermore, if $a_1, a_2, \ldots$ is a natural denumeration 
of a spread direction K, and for each $a_\nu$ a finite or denumerably infinite sequence 

$a_{\nu_1}, a_{\nu_2}, \ldots$ 

with ever increasing indices, can be indicated as its ramifying row, then for 
any given $a_\nu$ the sequence of its constituents in K can be reconstructed. 

A node b of a spread direction K, together with its descendants in K, 
constitutes a removable subspecies $\pi_b(K)$ of K which will be called a sector 
direction, and the species $P_b(K)$ of the arrows composed of nodes of $\pi_b(K)$ 
will be called a sector. Both $\pi_b(K)$ and $P_b(K)$ will be said to be dominated by 
their “top” b. We shall speak of a free sector [sector direction], if b is of order 1, 
and of a horned sector [sector direction] of order n, if b is of order n + 1 (n ≥ 1). 
In the latter case the constituents of the immediate ascendant of b will be said 
to form the horn of the sector [sector direction]. 







A subspecies of the spread direction K will be called thin if none of its nodes 
is a descendant of any other of its nodes. 

If a (not necessarily predeterminate) subspecies of the spread direction K 
has the property that no arrow of K can avoid it, it will be called a crude block 
of K. A crude block of K which is thin and removable will be called a proper 
block or simply a block of K. 
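
For a fan direction and a finite set of nodes, the two conditions "thin" and "no arrow can avoid it" can be checked mechanically, which may help in reading the definitions. The sketch below rests on assumptions that go beyond the text: nodes are tuples of positive integers, the fan direction is given by a hypothetical function `bound` returning the largest admitted next constituent (with `()` standing for the choice at order 1), and `max_order` is at least the largest order occurring in the candidate block, so that inspecting rods up to that order suffices.

```python
from typing import Callable, Set, Tuple

Node = Tuple[int, ...]


def is_thin(nodes: Set[Node]) -> bool:
    """No node of the set is a descendant of another node of the set."""
    return not any(
        len(q) > len(p) and q[:len(p)] == p for p in nodes for q in nodes
    )


def bars_fan(block: Set[Node], bound: Callable[[Node], int], max_order: int) -> bool:
    """True if every rod of order max_order of the fan direction contains a node
    of `block`, so that no arrow of the fan can avoid the block."""

    def unavoidable(p: Node) -> bool:
        if p in block:
            return True
        if len(p) >= max_order:
            return False  # a rod of full length that missed the block
        return all(unavoidable(p + (c,)) for c in range(1, bound(p) + 1))

    return all(unavoidable((c,)) for c in range(1, bound(()) + 1))
```

In the binary fan direction (`bound = lambda p: 2`) the set `{(1,), (2, 1), (2, 2)}` is thin and bars every arrow, whereas `{(1,)}` alone is avoided by any arrow starting with the node `(2,)`. Removability is not modelled here; for an explicitly listed finite set, membership is decidable in any case.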

The nodes of K which are not descendants of the block B(K) of K constitute 
a removable subspecies $\tau_B(K)$ of K which will be called a free stump, and which 
we shall say is carried by the block B(K). 

A node b belonging to $\tau_B(K)$, together with its descendants in $\tau_B(K)$, 
constitutes a removable subspecies ${}_b\sigma_B(K)$ of $\tau_B(K)$ which will be called a pyramid, 
and which we shall say is dominated by its “top” b. We shall speak of a free 
pyramid if b is of order 1, and of a horned pyramid of order n if b is of order n + 1 
(n ≥ 1). In the latter case the sequence of constituents of the immediate 
ascendant of b will be said to constitute the horn of ${}_b\sigma_B(K)$. 

If from the free pyramid ${}_b\sigma_B(K)$ [from the horned pyramid ${}_b\sigma_B(K)$ of order 
n] we take away the top b, the remainder ${}_b\rho_B(K)$ (also in the case of its reducing 
to “nothing,” if b belongs to B), will be called a horned stump of order 1 [of 
order n + 1]. The constituents of the removed top b will be said to constitute 
the horn of ${}_b\rho_B(K)$. 

If from all nodes of a horned stump ${}_b\rho_B(K)$ the horn is taken away, the remainder 
will be a free stump ${}_b\tau_B(K)$. This holds also in the case of b belonging to B, 
if “nothing” is added to the species of the free stumps. If ${}_b\rho_B(K)$ was of order 
n, we shall call ${}_b\tau_B(K)$ a free substump of $\tau_B(K)$ of rank n, dominated by b. 

To explain the notion of absorption of a row of free substumps of rank n by a 
free substump of rank n − 1, let $b_1, b_2, \ldots$ be a row of nodes of order n, dominated 
by the node a, and for each $\nu$ let $\beta_\nu$ be the last constituent of $b_\nu$. For each $\nu$, 
to each node of ${}_{b_\nu}\tau_B(K)$ we add $\beta_\nu$ as a first constituent, and to the horned 
stump of order 1 thus acquired we add the node $\beta_\nu$, thus arriving at a “row” of 
free pyramids $\sigma_1, \sigma_2, \ldots$ whose union is ${}_a\tau_B(K)$, a free substump of $\tau_B(K)$ of 
rank n − 1. This process of absorption can also be effected if some or all of the 
${}_{b_\nu}\tau_B(K)$ reduce to nothing. 

In an analogous way, by absorption of a finite sequence or a fundamental 
sequence of free stumps of spread directions K, a free stump of a new spread 
direction K comes into being. 


4. Well-ordered blocks and stumps. At this point, before continuing the 
study of spreads and fans, we have to insert some considerations about well- 
ordered species. 

A discrete species D is said to be completely ordered if for any two different 
elements of D, say a and b, one of the two mutually exclusive relations a < b 
(equivalent to b > a) and a > b (equivalent to b < a) is realized, in such a 
way that a < b, a = r and b = s implies r < s, and a < b and b < c implies 
a < c. 













Let R be a fundamental sequence [an ordered finite species] of disjoint 
completely ordered species $N_\nu$. We construct a complete order of the union M of 
the $N_\nu$ in the following way: Let e’ belong to N’ and e” to N”. Then we put 
e’ < e” in M if either N’ < N” in R or N’ = N” = N and e’ < e” in N. 
Denoting the species M ordered in this way by M, we write 


$M = N_1 + N_2 + \ldots$ [$M = N_1 + N_2 + \ldots + N_n$] or $M = \sum N_\nu$, 


and we shall say that M is the ordinal sum of the N,. The generation of an ordinal 
sum will be called ordinal addition. 
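
For the finite case the ordinal sum can be sketched in a few lines. The code assumes, purely for illustration, that each summand is given as a Python sequence listing its elements in increasing order and that the summands themselves are listed in their order in R; the function names are hypothetical, and the case of a fundamental (infinite) sequence of summands is not covered.

```python
from typing import Hashable, List, Sequence, Tuple

Summands = Sequence[Sequence[Hashable]]


def ordinal_sum(summands: Summands) -> List[Hashable]:
    """Elements of the union listed in the order of the ordinal sum.
    Raises ValueError if the summands are not disjoint."""
    result: List[Hashable] = []
    seen = set()
    for s in summands:
        for e in s:
            if e in seen:
                raise ValueError("summands must be disjoint")
            seen.add(e)
            result.append(e)
    return result


def locate(e: Hashable, summands: Summands) -> Tuple[int, int]:
    """Index of the summand containing e and the position of e inside it."""
    for i, s in enumerate(summands):
        for a, x in enumerate(s):
            if x == e:
                return i, a
    raise ValueError(f"{e!r} is not an element of the union")


def precedes(e1: Hashable, e2: Hashable, summands: Summands) -> bool:
    """e1 < e2 in the ordinal sum: either the summand of e1 precedes that of e2,
    or both lie in the same summand and e1 precedes e2 there."""
    (i, a), (j, b) = locate(e1, summands), locate(e2, summands)
    return i < j or (i == j and a < b)
```

For the summands `[["a", "b"], ["c"], ["d", "e"]]` the ordinal sum lists the elements as a, b, c, d, e, and `precedes("b", "c", ...)` holds because the first summand precedes the second.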

On the basis of this definition of ordinal addition we can generate a con- 
tinually extending stock of well-ordered species according to the following rules: 

(1) Each species containing one and only one element is a well-ordered 
species, and, as such, will be called a basic species. 

(2) If, out of the available stock of well-ordered species previously acquired, 
a fundamental sequence of disjoint well-ordered species has been indicated, 
their addition will be called a first generating operation, and their ordinal sum 
will again be called a well-ordered species and, as such, will be added to the 
stock. 

(3) If, out of the available stock of well-ordered species previously acquired, 
a non-vanishing ordered finite sequence of disjoint well-ordered species has 
been indicated, their addition will be called a second generating operation, and 
their ordinal sum will again be called a well-ordered species and, as such, will 
be added to the stock. 

In the case that only the second, not the first, generating operation is effected, 
we speak of bounded well-ordered species. 

Let F be a well-ordered species. All well-ordered species which, at some stage,
have played a part during the construction of F will be called constructional
subspecies of F. The constructional subspecies of F which have played a part
in the final generating operation of F will be denoted by $F_\nu$ ($\nu$ passing through the
sequence of natural numbers or through an initial segment of it) and will be
said to constitute the row of constructional subspecies of order 1 of F. The
constructional subspecies of order 1 of $F_{\nu_1}$ will be denoted by $F_{\nu_1\nu}$ ($\nu$ varying as
above) and will be said to constitute a row of constructional subspecies of order 2
of F. In general, the row of constructional subspecies of order 1 of $F_{\nu_1 \ldots \nu_k}$
will be denoted by $F_{\nu_1 \ldots \nu_k \nu}$ ($\nu$ varying as above) and will be said to constitute a
row of constructional subspecies of order k + 1 of F. F itself will be considered as
its own constructional subspecies of order zero.
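
The downward assignment of index sequences can be illustrated for a bounded well-ordered species. In this sketch, which is an assumption of the illustration and not notation from the paper, an ordinal sum is modelled as a Python tuple of its summands and a basic species as any non-tuple value; the name `constructional_subspecies` is hypothetical.

```python
def constructional_subspecies(F, indices=()):
    """Yield (index sequence, subspecies) for F and everything used to build it."""
    yield indices, F
    if isinstance(F, tuple):                      # F is an ordinal sum
        for nu, part in enumerate(F, start=1):    # row of order 1 of F
            yield from constructional_subspecies(part, indices + (nu,))

# F = (a + b) + c + ((d) + e), built by second generating operations only
F = (("a", "b"), "c", (("d",), "e"))
for indices, species in constructional_subspecies(F):
    # the order of a constructional subspecies is the length of its index sequence
    print(len(indices), indices, species)
```

Here F itself receives the empty sequence (order zero), and the basic species `"d"` receives the sequence (3, 1, 1), so it is a constructional subspecies of order 3.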

In this way each basic species, that is, each element, of F, and each
constructional subspecies of F, turns out to be a constructional subspecies of finite
order (which order, however, for appropriately chosen constructional subspecies
may increase indefinitely). This property is easily proved by the inductive method,
i.e., by remarking that it holds if F is a basic species, and that when a generating
operation is performed, it holds for the generated ordinal sum if it holds for
the terms of the sum. By the same method we state that the species of sequences
of indices of the constructional subspecies of a well-ordered species is a removable
subspecies of USD, that every well-ordered species in whose construction the
first generating operation has been effected at least once is denumerably infinite,
and that every bounded well-ordered species is finite.
It is also by the inductive method that we shall prove the following theorem:
For each well-ordered species F there is a 1-1 correspondence between the species
of its constructional subspecies of non-vanishing order and a free stump $\tau$ such
that each sequence of indices of a constructional subspecies of F corresponds to an
equal sequence of constituents of a node of $\tau$, while a basic species of F corresponds
to a node of the block carrying $\tau$, and the union of an $F_\nu$ and its constructional
subspecies corresponds to a free pyramid of $\tau$.
For, let

$$F_{\nu_1 \ldots \nu_{n-1} 1},\ F_{\nu_1 \ldots \nu_{n-1} 2},\ \ldots$$

be a row of constructional subspecies of order n of F, and for each $\nu$, let
$F_{\nu_1 \ldots \nu_{n-1} \nu}$
be provided with a 1-1 equality-mapping of the sequences of the indices following
$\nu$ of its constructional subspecies onto a free stump $\tau_{B_\nu}(K_\nu)$ (containing as
constituents of its nodes only indices of order > n from F). Then the row

$$\tau_{B_1}(K_1),\ \tau_{B_2}(K_2),\ \ldots$$

can be considered as a row of free substumps of rank 1 of a free stump $\tau_B(K)$,
by which it can be absorbed. Accomplishing this absorption, and assigning
to each sequence of indices following $\nu_{n-1}$ of a constructional subspecies of
$F_{\nu_1 \ldots \nu_{n-1}}$ an equal sequence of constituents of a node of $\tau_B(K)$, we arrive at a
1-1 equality-mapping of the sequences of the indices following $\nu_{n-1}$ of the
constructional subspecies of $F_{\nu_1 \ldots \nu_{n-1}}$ onto $\tau_B(K)$. And if $F_{\nu_1 \ldots \nu_{n-1}}$ is a basic
species of F the mapping as required by the theorem exists as a mapping of
nothing onto “nothing.”

Blocks and free stumps which can play the part of a B(K) and a $\tau_B(K)$ as
required by the above theorem will be called well-ordered blocks and well-ordered
free stumps respectively. The free pyramids which are contained in a
well-ordered free stump will be called well-ordered free pyramids. Horned pyramids
which after removal of their horns become well-ordered free pyramids will be
called well-ordered horned pyramids.

Obviously each free stump corresponding to a bounded well-ordered species 
is finite. 

The above assignment of a sequence of indices to each constructional sub- 
species of non-vanishing order of a well-ordered species F was performed in a 
downward direction, but the same result can be obtained as well by an upward 
construction consisting in a gradual dressing-up of F parallel to its generation, 
according to the following prescriptions: 

(i) At each ordinal addition of an ordered sequence d of basic species, to
each of these species is assigned, as its only index, the natural number
indicating its place in d.













(ii) At each ordinal addition of an ordered sequence d of well-ordered species 
previously acquired, for each of these species (and for each of their 
constructional subspecies) the natural number indicating its place in d 
is added to its adhering sequence of indices previously acquired, as a 
first index. 
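
Prescriptions (i) and (ii) can be sketched as a dressing-up that runs parallel to the generation. The representation is an assumption of this illustration: a dressed species is a list of (index sequence, subspecies) records whose first record is the species itself, and the helper names `basic` and `ordinal_add` are hypothetical. Only finite ordinal additions are modelled.

```python
def basic(element):
    """A basic species; before any addition it carries no index."""
    return [((), element)]

def ordinal_add(*parts):
    """Ordinal addition of previously acquired dressed species.

    Each summand, and each of its constructional subspecies, receives its
    place in the sequence as a new FIRST index (prescriptions (i) and (ii)).
    """
    content = tuple(part[0][1] for part in parts)   # the new ordinal sum
    dressed = [((), content)]
    for place, part in enumerate(parts, start=1):
        for indices, subspecies in part:
            dressed.append(((place,) + indices, subspecies))
    return dressed

F = ordinal_add(ordinal_add(basic("a"), basic("b")), basic("c"))
for indices, subspecies in F:
    print(indices, subspecies)
```

For this small F the upward construction yields the index sequences (1), (1, 1), (1, 2), (2), the same ones a downward assignment would give.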

If, in an analogous way, for a given spread direction K in which a thin
subspecies C(K) has been indicated, we succeed in arriving at the free stump
$\tau_B(K)$ by allowing in K the following sorts of acts:

(i) the qualification of a node of C(K) as dominating a free substump
“nothing,”

(ii) the formation of a free substump of rank n − 1 by absorbing a row of
free substumps of rank n,

then this gradual erection of the edifice of nodes of $\tau_B(K)$ (proving by the way
that C(K) = B) is identical with the above upward construction of the edifice
of sequences of indices of the constructional subspecies of a proper well-ordered
species F, so that $\tau_B(K)$ is a well-ordered free stump and we may speak of a
well-ordered erection of $\tau_B(K)$.

By extending a given free stump to its spread direction we see that a natural 
denumeration of the latter yields a natural denumeration of the former. So 
also the species of the sequences of indices of the constructional subspecies of a 
well-ordered species can be denumerated in a natural way. 

We shall show by an example that not every block is a well-ordered block, 
and hence that not every free stump admits of a well-ordered erection. 

Let K be a spread direction, and let $\beta_\nu$ be the species of the nodes of K of
order $\nu$. Let a $\nu$-union be a union of species $\beta_\nu$ with regard to which an infinite
sequence of decisions $g_1, g_2, \ldots$ successively decides whether $\beta_1$ belongs to the
union, whether $\beta_2$ belongs to the union, and so on, and let V be the species of the
$\nu$-unions. Let $\alpha$ be a mathematical assertion so far neither tested nor recognized
as testable, and let $v_\alpha$ be the element of V generated as follows: As long as in the
course of the successive choices of the decisions $g_\nu$ the creating subject has
neither experienced the truth nor the absurdity of $\alpha$, each $g_\nu$ will be chosen to
be negative; but as soon as between the choice of the decision $g_{r-1}$ and that of
the decision $g_r$ the creating subject has experienced either the truth or the
absurdity of $\alpha$, $g_r$ will be chosen to be affirmative and for each natural number
$\nu$, $g_{r+\nu}$ will again be chosen to be negative.

Obviously this $v_\alpha$ is a block of K of which we cannot say that it is a
well-ordered block.


5. The fan theorem. If a (not necessarily predeterminate) subspecies 
C(K) of the spread direction K has the property that every arrow of K meets 
C(K), i.e., has a node in common with C(K), this subspecies C(K) will be called 
a crude bar of K. A crude bar of K which is thin will be called a proper bar or 
simply a bar of K. 

The definition of a crude bar means that for every arrow $\alpha$ of K the order
$n(\alpha)$ of the postulated node of intersection with C(K) must be computable,
however complicated this calculation may be. For instance, the algorithm
in question may indicate the calculation of a maximal order $n_1$ at which will
appear a finite method of calculation of a further maximal order $n_2$ at which
will appear a finite method of calculation of a further maximal order $n_3$, at
which will appear a finite method of calculation of a further maximal order $n_4$
at which the postulated node of intersection must have been passed. And much
higher degrees of complication are thinkable.

If C(K) is a crude bar of K, then every node t of K has either been recognized
as belonging to C(K) or been provided with a constructive mathematical
argument $h_t$ proving that t is barred by C(K), i.e., that every arrow passing through
t has a node of intersection with C(K).

For this mathematical argument $h_t$ no other basis is available than the
characterization of C(K), and the species of constructional relations existing
between the nodes of K. Now all these relations can be derived from the basic
relations which for each node indicate its immediate predecessor in its row
of nodes (or the non-existence of an immediate predecessor in its row of nodes),
its immediate successor in its row of nodes (or the non-existence of an immediate
successor in its row of nodes), its immediate ascendant (or the non-existence of an
immediate ascendant), and the row of its immediate descendants. (Whether
this system of basic relations is susceptible of further reductions, we shall
leave undecided.) Consequently, if we split up the argument $h_t$ into an
argument $k_t$ consisting exclusively of statements of atomic basic facts d and atomic
immediately obvious inferences e, then, supposing $t = \nu_1 \ldots \nu_n$, the final inference
of $k_t$ must deduce the barred condition of t either from t being recognized
as belonging to C(K) or from the barred condition of $\nu_1 \ldots \nu_{n-1}$ (a so-called
ζ-inference) or from the barred condition of $\nu_1 \ldots \nu_n\lambda$ for each $\lambda$ (a so-called
ϝ-inference). If, in particular, t is a node $\nu_1$ of order 1, the final inference of $k_t$,
recognizing that t is barred, must either be the recognition of t as a node of
C(K) or the ϝ-inference deducing the barred condition of $\nu_1$ from the barred
condition of $\nu_1\lambda$ for each $\lambda$. So in the latter case the recognition of the barred
condition of $\nu_1$ has been preceded in $k_t$ by the recognition of the barred condition
of $\nu_1\lambda$ for each $\lambda$. From this follows that in $k_t$ the recognition of the barred
condition of $\nu_1\lambda_1$, preceding that of $\nu_1$, must in its turn either be based on its
belonging to C(K) or have been preceded by the recognition of the barred
condition of $\nu_1\lambda_1\lambda$ for each $\lambda$, from which it has been deduced by a ϝ-inference;
and so on.

Consequently, if t is a node of order 1, then in $k_t$ appear

(1) a certain species of nodes $N_t$, including t and a certain thin subspecies
$C_t(K)$ of C(K),

(2) the species $S_t$ of the statements of the barred condition of an element
of $N_t$,

(3) a species $J_t$ of ϝ-inferences connecting elements of $S_t$

such that each element of $S_t$ is connected with the statement of the barred
condition of t by a finite sequence of elements of $J_t$, that each element of $S_t$,
with the exception of the statements of the barred condition of an element of
$C_t(K)$, has a row of predecessors in the argument with which it is connected
by an element of $J_t$, and that each element of $S_t$, with the exception of the
statement of the barred condition of t, has a successor in the argument with
which it is connected by an element of $J_t$.

If we now take for t successively each node of order 1 of K, and consider the
union k of the corresponding arguments $k_t$, then in k appear

(1) a certain species of nodes N, including all nodes of order 1 of K and a
certain thin subspecies $C_0(K)$ of C(K),

(2) the species S of the statements of the barred condition of an element of N,

(3) a species J of ϝ-inferences connecting elements of S

such that each element of S is connected with the statement of the barred
condition of a node of order 1 of K by a finite sequence of elements of J, that
each element of S, with the exception of the statements of the barred condition
of an element of $C_0(K)$, has a row of predecessors in the argument with which
it is connected by an element of J, and that each element of S, with the exception
of the statements of the barred condition of a node of order 1 of K, has a
successor in the argument with which it is connected by an element of J.

In this way from the argument k we have extracted an argument k′ which by
performing acts of the two following sorts in K:

(i) taking an element of $C_0(K)$ as a basic pyramid consisting of barred nodes,

(ii) taking the union of a row of pyramids consisting of barred nodes previously
acquired and the dominant of their row of tops, thus obtaining a new
pyramid consisting of barred nodes,

has arrived at a row of free pyramids consisting of barred nodes whose row of
tops is the row of nodes of order 1 of K.

This argument k′ comes to the same as the argument k″ which by performing
acts of the two following sorts in K:

(i) assigning to an element a of $C_0(K)$ a free substump “nothing” dominated
by a,

(ii) having a row of free substumps consisting of barred nodes previously
acquired, absorbed by a new free substump,

has arrived at a free stump of K consisting of barred nodes.

So, as was shown in §4, this argument k″ in its turn comes to the same as the
well-ordered erection of the species of nodes N as a well-ordered free stump of K,
carried by the well-ordered block $C_0(K)$.

With which we have deduced the 


BAR THEOREM. Every crude bar contains a well-ordered block.


¹Cf. (6, pp. 63-65). The species $\mu_1$ used there plays the role of the above species C(K).
The equivalence of the principles of the excluded third and of reciprocity of complementarity,
mentioned there in a footnote by way of remark, subsequently has been recognized as
non-existent. In fact, as was also shown in the present paper, the fields of validity of these two
principles have turned out to be essentially different.




This theorem does not imply that every well-ordered block is a bar. 

In the case that K is a fan direction, its well-ordered free stumps, on account
of their correspondence to bounded well-ordered species, are all finite; so the
above species of nodes N is finite, and there will be a finite maximum O(N)
for the order of its nodes. Furthermore in this case the well-ordered block $C_0(K)$
is a bar.

Now we easily prove the 


FAN THEOREM. Let K be a fan direction, and let us suppose that to each arrow
$\alpha$ of K has been assigned a natural number $\mu(\alpha)$. Then a natural number s can be
indicated such that, for any $\alpha$, $\mu(\alpha)$ is determined at the sth node of $\alpha$ (6, p. 66; 10,
p. 143).

For, since the natural number in question has to be known for each arrow
of K at one of its nodes, the nodes yielding this knowledge constitute a species
of nodes which each arrow of K is bound to meet, and which therefore is a
crude bar C(K) of K. Because this C(K) contains a well-ordered block $C_0(K)$,
and this well-ordered block $C_0(K)$ in the present case is finite and a bar of
K, a maximum s can be indicated for the order of its nodes, so that each arrow
$\alpha$ of K meets $C_0(K)$ not later than at its sth node. Hence, for each $\alpha$, at its
sth node, $\mu(\alpha)$ is determined.
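
The content of the bound s can be illustrated by a classical brute-force search. This is not the intuitionistic argument given above: it assumes a fan whose nodes are tuples of constituents, a hypothetical function `successors` listing the finitely many admissible next constituents, and a decidable predicate `in_bar`; all three names, and the guard `max_depth`, are assumptions of the sketch.

```python
def uniform_bound(successors, in_bar, max_depth=64):
    """Return s such that every arrow meets the bar at or before its s-th node.

    The search proceeds order by order and keeps only the nodes that
    have not yet been barred; it stops when none are left.
    """
    frontier = [()]            # the empty node of order 0, not yet barred
    depth = 0
    while frontier:
        if depth >= max_depth:
            raise RuntimeError("no bound found up to max_depth")
        depth += 1
        frontier = [
            node + (c,)
            for node in frontier
            for c in successors(node)
            if not in_bar(node + (c,))
        ]
    return depth

# Binary fan; a node is barred if its last constituent is 1 or its order is 3.
s = uniform_bound(lambda node: [0, 1],
                  lambda node: node[-1] == 1 or len(node) == 3)
print(s)
```

In the example the only arrow that avoids the bar for two steps begins 0, 0, and it is stopped at order 3, so the search returns 3.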


6. The continuity theorem. The infinite sequence of natural numbers
passes into a located infinite sequence $c_1, c_2, \ldots$ if for any two of its elements
$c_r$ and $c_s$ a symmetric limiting number core function $\rho(c_r, c_s)$, called the distance
of $c_r$ and $c_s$, is indicated, which has the following properties:

(1) For $c_r = c_s$, $\rho(c_r, c_s) = 0$.

(2) For $c_r \neq c_s$, a natural number $f(c_r, c_s)$ can be indicated such that

$$\rho(c_r, c_s) > 2^{-f(c_r, c_s)}.$$

(3) $\rho(c_r, c_s) \leq \rho(c_r, c_t) + \rho(c_s, c_t)$.

(4) For each n a natural number $\mu(n)$ can be indicated such that, if we
denote the union of $c_1, c_2, \ldots$ and $c_{\mu(n)}$ by $\gamma_n$, then $\rho(c_\nu, \gamma_n) < 4^{-n}$ for each $\nu$.

We shall express the property (4) by saying that the sequence $c_1, c_2, \ldots$ is
approximated with any degree of accuracy by its successive initial segments.

Let L be a located infinite sequence. An infinite sequence $a_1, a_2, \ldots$ of elements
of L (among which equalities may occur) will be said to be convergent if for each
n a natural number $\mu(n)$ can be indicated such that


p(Aym), a,) < . 


for any $\nu \geq \mu(n)$. A convergent infinite sequence of elements of L will also be
called a limiting element of L. Regarding as self-explanatory the meaning of
coincidence of two limiting elements, we shall call the species of the limiting
elements of L coinciding with a given limiting element of L, a point core of L.


The species RL of the point cores of L will be called a located compact topological 
space. 













If in a spread direction [fan direction] and in the corresponding spread [fan]
each constituent of a node is replaced by some mathematical entity in such
a way that in each node $\nu_1 \ldots \nu_n \nu_{n+1}$ the constituents $\nu_1, \nu_2, \ldots, \nu_n$ are replaced
by the same mathematical entities as in the node $\nu_1 \ldots \nu_n$, the result of this
process will be called a dressed spread direction [dressed fan direction] with a
corresponding dressed spread [dressed fan].

Let us consider a dressed fan direction SRL whose row of nodes $b_1, b_2, \ldots$ of
order 1 consists of the elements of $\gamma_1$, whilst each element of a row of nodes

$$b_{\nu_1 \ldots \nu_n 1},\ b_{\nu_1 \ldots \nu_n 2},\ \ldots$$

of order n + 1 consists of the immediate ascendant of the row followed by
a constituent for which is chosen successively each element of a subspecies of
$\gamma_{n+1}$ which, though arbitrary to a certain extent, must include all elements of
$\gamma_{n+1}$ at a distance $< 2 \cdot 4^{-n}$ from the last constituent of $b_{\nu_1 \ldots \nu_n}$, and must exclude
all elements of $\gamma_{n+1}$ at a distance $> 3 \cdot 4^{-n}$ from the last constituent of $b_{\nu_1 \ldots \nu_n}$.

Each arrow of SRL defines a limiting element of L. For in each arrow of SRL
each accretion of order n, i.e., each last constituent of a node of order n has a
distance less than

$$(3 \cdot 4^{-n} + 3 \cdot 4^{-n-1} + 3 \cdot 4^{-n-2} + \ldots) = 4^{-n+1}$$

from each of its descendant accretions of order > n.

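Each limiting element of L coincides with an arrow of SRL. For, let $a_1, a_2, \ldots$
be a limiting element $\lambda$ of L, and $\nu_1 < \nu_2 < \nu_3 < \ldots$ an infinite sequence of
increasing natural numbers such that

$$\rho(a_{\nu_n}, a_\nu) < 4^{-n-2}$$

for any $\nu \geq \nu_n$. If to each $a_{\nu_n}$ we assign an element $\sigma_n$ of $\gamma_n$ at a distance
$< 4^{-n} + 4^{-n-1}$ from $a_{\nu_n}$, then from

$$\rho(\sigma_n, a_{\nu_n}) < 4^{-n} + 4^{-n-1}, \quad \rho(a_{\nu_n}, a_{\nu_{n+1}}) < 4^{-n-2}, \quad \rho(a_{\nu_{n+1}}, \sigma_{n+1}) < 4^{-n-1} + 4^{-n-2}$$

follows $\rho(\sigma_n, \sigma_{n+1}) < 2 \cdot 4^{-n}$, so that the infinite sequence $\sigma_1, \sigma_2, \sigma_3, \ldots$ generates
an arrow of SRL which, because $\rho(\sigma_n, a_{\nu_n}) < 4^{-n} + 4^{-n-1}$, coincides with $\lambda$.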

If two limiting elements $\lambda_1$ and $\lambda_2$ of L are at a distance $< 4^{-n-2}$ from each
other, then the distance of their respective $a_{\nu_n}$ is $< 3 \cdot 4^{-n-2}$. Hence we can assign
the same $\sigma_n$ to both these $a_{\nu_n}$, so that $\lambda_1$ and $\lambda_2$ correspond to two arrows of
SRL which have an accretion of order n in common, and with which they
coincide respectively.

On the other hand, two point cores of RL coinciding with two arrows of
SRL respectively which have a common accretion of order n, are at a distance
$< 2 \cdot 4^{-n+1}$ from each other. So that we have proved:


LEMMA 1. To each natural number $p_1$ a natural number $p_2$ can be assigned
such that any two point cores of RL whose distance is $< 2^{-p_2}$ contain respectively
two arrows of SRL which have their rod of order $p_1$ in common.


And conversely : 







LEMMA 2. To each natural number $p_1$ a natural number $p_2$ can be assigned
such that any two arrows of SRL which have their rod of order $p_2$ in common belong
respectively to two point cores of RL which have a distance $< 2^{-p_1}$ from each other.


Let R′L′ and R″L″ be two located compact topological spaces, and let f
be a full mapping of R′L′ onto R″L″, i.e., an assignment of a point core of
R″L″ to each point core of R′L′. Such a full mapping implies the assignment
A of an arrow $\varphi(E')$ of S″R″L″ to each arrow E′ of S′R′L′ in such a way that
to coinciding arrows E′ coinciding arrows $\varphi(E')$ are assigned.

Applying the fan theorem to this assignment A, we obtain: 


LEMMA 3. To each natural number $p_2$ a natural number $p_3$ can be assigned
such that the rod of order $p_2$ of $\varphi(E')$ is for each E′ determined by its rod of order
$p_3$, so that to any two arrows of S′R′L′ containing the same rod of order $p_3$ two
respective arrows of S″R″L″ are assigned by A which contain the same rod of
order $p_2$.


By successive application of Lemmas 2, 3, and 1 we find that to each natural
number $p_1$ there corresponds a natural number $p_4$ such that to each pair of
point cores of R′L′ whose distance is $< 2^{-p_4}$ the mapping f assigns a pair of
point cores of R″L″ which have a distance $< 2^{-p_1}$ from each other.

This result establishes the 


CONTINUITY THEOREM. Every full mapping of a located compact topological 
space onto another located compact topological space is uniformly continuous. 


In particular, a bounded function of a compact segment of the linear continuum 
is uniformly continuous (6, p. 67; 10, pp. 145-146). 


REFERENCES 


1. H. Weyl, Über die neue Grundlagenkrise der Mathematik, Math. Z., 10 (1921), 39-79.

2. A. Dresden, Brouwer's contributions to the foundations of mathematics, Bull. Amer. Math.
Soc., 30 (1924), 31-40.

3. R. Wavre, Y a-t-il une crise des mathématiques? Revue de Métaphysique et de Morale,
31 (1924), 435-470.

4. ———, Logique formelle et logique empiriste, Revue de Métaphysique et de Morale, 33
(1926), 65-75.

5. P. Lévy, R. Wavre, and E. Borel, Discussions, Revue de Métaphysique et de Morale, 33
(1926), 253-258, 425-430, 545-551; 34 (1927), 271-276.

6. L. E. J. Brouwer, Über Definitionsbereiche von Funktionen, Math. Ann., 97 (1927), 60-75.

7. ———, Wissenschaft, Mathematik und Sprache, Monatsh. Math. Phys., 36 (1928), 153-164.

8. G. Mannoury, La question vitale: A ou B?, Nieuw Archief Wisk. (2), 21 (1943), 161-167.

9. L. E. J. Brouwer, Consciousness, philosophy and mathematics, Proc. Tenth International
Congress of Philosophy (Amsterdam, 1948), 1235-1249.

10. ———, Historical background, principles and methods of intuitionism, South African
J. Sci., Oct.-Nov. 1952, 139-146.


Blaricum, Holland 











NOTE ON THE GAMMA FUNCTION 


B. VAN DER POL 





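1. The gamma function $\Gamma(z + 1) = \Pi(z)$ has been defined in different
ways:¹

$$\Pi(z) = e^{-\gamma z} \prod_{k=1}^{\infty} \frac{e^{z/k}}{1 + z/k} \qquad \text{(Weierstrass)} \tag{1}$$

$$\Pi(z) = \prod_{k=1}^{\infty} \frac{(1 + 1/k)^z}{1 + z/k} \qquad \text{(Euler)} \tag{2}$$

$$\Pi(z) = \lim_{N \to \infty} \left\{ N^z \prod_{k=1}^{N} \frac{1}{1 + z/k} \right\} \qquad \text{(Gauss)} \tag{3}$$

$$\Pi(z) = \int_0^{\infty} e^{-s} s^z \, ds \quad (\Re z > -1), \qquad \text{(Euler)} \tag{4}$$

$$\Pi(z) = \exp \left[ \frac{\partial}{\partial s} \{ \zeta(s, z + 1) - \zeta(s, 1) \} \right]_{s=0} \qquad \text{(Lerch)} \tag{5}$$

where, for $\Re s > 1$, $\zeta(s, z)$ is given by

$$\zeta(s, z) = \sum_{k=0}^{\infty} \frac{1}{(z + k)^s} \qquad (z > 0) \tag{5a}$$

which is Hurwitz’s generalization of Riemann’s zeta function

$$\zeta(s) = \zeta(s, 1) = \sum_{k=1}^{\infty} \frac{1}{k^s} \qquad (\Re s > 1). \tag{5b}$$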
rai R 


The gamma function has also been defined by Harald Bohr and Mollerup
(2, 154, 161-163) and Artin (1) as that solution of the difference equation

$$\Gamma(x + 1) = x\,\Gamma(x) \tag{6}$$

which is normalized through

$$\Gamma(1) = 1,$$

and which is logarithmically convex for x > 0.
Finally, we have the well-known Stirling formula

$$\Pi(z) = \sqrt{2\pi z}\; z^z e^{-z + \mu(z)} \qquad \text{(Stirling)} \tag{7}$$

where

$$\mu(z) = \int_0^{\infty} \frac{[s] - s + \tfrac12}{s + z} \, ds \tag{7a}$$

or, asymptotically,
Received May 12, 1953. 
¹An up-to-date review of the gamma function is to be found in (3).




$$\mu(z) = \frac{B_2}{1 \cdot 2\, z} + \frac{B_4}{3 \cdot 4\, z^3} + \frac{B_6}{5 \cdot 6\, z^5} + \ldots \tag{7b}$$








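In (7b), $B_n$ are the Bernoulli numbers defined by the generating function

$$\frac{t}{e^t - 1} = \sum_{n=0}^{\infty} B_n \frac{t^n}{n!}.$$

Other expressions for $\mu(z)$ are

$$\mu(z) = \int_0^{\infty} e^{-zt} \left( \frac{1}{e^t - 1} - \frac{1}{t} + \frac{1}{2} \right) \frac{dt}{t} \qquad (\Re z > 0) \tag{7c}$$

$$\mu(z) = \sum_{k=0}^{\infty} \left\{ \left(z + k + \tfrac12\right) \log \left(1 + \frac{1}{z + k}\right) - 1 \right\} \qquad \text{(Gudermann)} \tag{7d}$$

and

$$\mu(z) = 2 \int_0^{\infty} \frac{\tan^{-1}(s/z)}{e^{2\pi s} - 1} \, ds \quad (\Re z > 0) \qquad \text{(Binet).} \tag{7e}$$

The expressions (7a), (7b), and (7d) are valid in the whole z-plane cut from
−∞ to 0 along the negative real axis, the values being principal values.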

To this list we now add the following formula which will be the subject of 
this note: 


$$\Pi(z) = z^z e^{-z} \prod_{k=0}^{\infty} \left( \frac{e_{z+k}}{e_k} \right) = z^z e^{-z} \lim_{N \to \infty} \frac{e_z\, e_{z+1}\, e_{z+2} \cdots e_{z+N}}{e_0\, e_1\, e_2 \cdots e_N} \tag{8}$$

where $e_a = (1 + 1/a)^a$, and $e_0 = 1$.
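
Formula (8) lends itself to a quick numerical check against a library gamma function. The sketch below works with logarithms for real z > 0 only and truncates the product at a finite N, so agreement is expected only up to an error of order z/(2N); the function names and the choice of N are assumptions of the illustration.

```python
import math

def log_e(a):
    """log of e_a = (1 + 1/a)^a, with the convention e_0 = 1."""
    return 0.0 if a == 0 else a * math.log1p(1.0 / a)

def log_pi_formula_8(z, n_terms=200_000):
    """log of z^z e^{-z} prod_{k=0}^{N} e_{z+k} / e_k for real z > 0."""
    total = z * math.log(z) - z
    for k in range(n_terms + 1):
        total += log_e(z + k) - log_e(k)
    return total

for z in (0.5, 1.0, 2.5):
    # Pi(z) = Gamma(z + 1), so compare with lgamma(z + 1)
    print(z, log_pi_formula_8(z), math.lgamma(z + 1.0))
```

The individual factors behave like 1 + z/(2k²) for large k, which is why the product converges as it stands while numerator and denominator taken separately do not.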


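2. In order to derive (8), we first write (1) as

$$\log \Pi(z) + \gamma z = \sum_{k=1}^{\infty} \left\{ \frac{z}{k} - \log \left(1 + \frac{z}{k}\right) \right\}.$$

Next we express the sum as a Stieltjes integral

$$\log \Pi(z) + \gamma z = \int_{1-0}^{\infty} \left\{ \frac{z}{s} - \log \left(1 + \frac{z}{s}\right) \right\} d[s]$$

and transform the latter as follows:

$$= \int_{1-0}^{\infty} \left\{ \frac{z}{s} - \log \left(1 + \frac{z}{s}\right) \right\} d([s] - s) + \int_{1}^{\infty} \left\{ \frac{z}{s} - \log \left(1 + \frac{z}{s}\right) \right\} ds.$$

Hence

$$\log \Pi(z) + \gamma z = \int_{1-0}^{\infty} \left\{ \frac{z}{s} - \log \left(1 + \frac{z}{s}\right) \right\} d([s] - s) + (z + 1) \log (z + 1) - z. \tag{9}$$

We further have for the Euler constant:

$$\gamma = \int_{1-0}^{\infty} \frac{1}{s} \, d([s] - s),$$

which enables us to simplify (9) to

$$\log \Pi(z) = (z + 1) \log (z + 1) - z - \int_{1-0}^{\infty} \log \left(1 + \frac{z}{s}\right) d([s] - s).$$

The Stieltjes integral can now be integrated by parts, the integrated part
being −log(z + 1). Hence we are left with

$$\log \Pi(z) = z \log (z + 1) - z + \sum_{k=1}^{\infty} \int_{k}^{k+1} (k - s) \left( \frac{1}{s + z} - \frac{1}{s} \right) ds \tag{10}$$

$$= z \log (z + 1) - z + \sum_{k=1}^{\infty} \left\{ (z + k) \log \left(1 + \frac{1}{z + k}\right) - k \log \left(1 + \frac{1}{k}\right) \right\}.$$

Taking exponentials we have

$$\Pi(z) = (z + 1)^z e^{-z} \prod_{k=1}^{\infty} \left( \frac{e_{z+k}}{e_k} \right) \tag{11}$$

where $e_a = (1 + 1/a)^a$. If we further define $e_0 = 1$, we obtain as our final
expression

$$\Pi(z) = z^z e^{-z} \prod_{k=0}^{\infty} \left( \frac{e_{z+k}}{e_k} \right) \tag{8}$$

which is valid in the whole z-plane cut from −∞ to 0 along the negative real
axis, the values being principal values.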


3. Comparing (8) with Stirling’s formula (7) we note that both contain the
factor $z^z e^{-z}$. Hence after splitting off this factor we arrive at the interesting
relation

$$\sqrt{2\pi z}\; e^{\mu(z)} = \prod_{k=0}^{\infty} \left( \frac{e_{z+k}}{e_k} \right). \tag{12}$$


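4. It is easy to show that our expression (8) satisfies the difference equation

$$\Pi(z + 1) = (z + 1)\,\Pi(z).$$

To this end we rewrite (8) with z replaced by z + 1 and transform as follows:

$$\Pi(z + 1) = (z + 1)^{z+1} e^{-(z+1)} \lim_{N \to \infty} \frac{e_{z+1}\, e_{z+2} \cdots e_{z+N+1}}{e_0\, e_1 \cdots e_N}$$

$$= (z + 1)^{z+1} e^{-(z+1)}\, \frac{1}{e_z} \lim_{N \to \infty} \frac{e_z\, e_{z+1} \cdots e_{z+N}\; e_{z+N+1}}{e_0\, e_1 \cdots e_N}$$

$$= (z + 1)\, z^z e^{-(z+1)} \lim_{N \to \infty} \frac{e_z\, e_{z+1} \cdots e_{z+N}}{e_0\, e_1 \cdots e_N} \; \lim_{N \to \infty} e_{z+N+1}$$

$$= (z + 1)\, z^z e^{-z} \lim_{N \to \infty} \frac{e_z\, e_{z+1} \cdots e_{z+N}}{e_0\, e_1 \cdots e_N}$$

$$= (z + 1)\,\Pi(z).$$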





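5. A check of (8) can be obtained from (11). The latter expression can be
written

$$\Pi(z) = (z + 1)^z e^{-z} \lim_{N \to \infty} \frac{\left(\frac{z+2}{z+1}\right)^{z+1} \left(\frac{z+3}{z+2}\right)^{z+2} \cdots \left(\frac{z+N+1}{z+N}\right)^{z+N}}{\left(\frac{2}{1}\right)^{1} \left(\frac{3}{2}\right)^{2} \cdots \left(\frac{N+1}{N}\right)^{N}}$$

$$= e^{-z} \lim_{N \to \infty} \frac{(z + N + 1)^{z+N}}{(N + 1)^N} \prod_{k=1}^{N} \frac{1}{1 + z/k}$$

$$= \lim_{N \to \infty} N^z \prod_{k=1}^{N} \frac{1}{1 + z/k},$$

this being Gauss’s formula (3).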


6. The infinite product in (8) is convergent only as it stands, whereas the
infinite products occurring in the numerator and denominator separately would
diverge. This however is simply overcome if we rewrite (10) as follows:

$$\log \Pi(z) = z \log (z + 1) - z + \sum_{k=1}^{\infty} \left\{ (z + k) \log \left(1 + \frac{1}{z + k}\right) - 1 + \frac{1}{2k} \right\} - \sum_{k=1}^{\infty} \left\{ k \log \left(1 + \frac{1}{k}\right) - 1 + \frac{1}{2k} \right\}, \tag{13}$$

where now each of the two series converges by itself. The second, numerical,
series can at once be evaluated. One finds

$$\sum_{k=1}^{\infty} \left\{ k \log \left(1 + \frac{1}{k}\right) - 1 + \frac{1}{2k} \right\} = 1 + \tfrac12 \{ \gamma - \log (2\pi) \}. \tag{13a}$$
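
The value in (13a) can be checked numerically. The sketch below sums finitely many terms, so it should agree with the closed form only up to the tail of the series, which is roughly 1/(3N); the constant `EULER_GAMMA` is typed in by hand and the function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, entered by hand

def partial_sum_13a(n_terms):
    """Sum of k*log(1 + 1/k) - 1 + 1/(2k) for k = 1, ..., n_terms."""
    return sum(k * math.log1p(1.0 / k) - 1.0 + 1.0 / (2 * k)
               for k in range(1, n_terms + 1))

closed_form = 1.0 + 0.5 * (EULER_GAMMA - math.log(2.0 * math.pi))
print(partial_sum_13a(200_000), closed_form)
```

Each term is of order 1/(3k²), which is why this series, unlike the sum of k log(1 + 1/k) alone, converges by itself.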


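Combining (13) and (13a), and taking exponentials, we now obtain, instead
of (8):

$$\Pi(z) = \sqrt{2\pi}\, (z + 1)^z e^{-(z+1) - \frac12 \gamma} \prod_{k=1}^{\infty} \left( \frac{e_{z+k}}{e}\, e^{1/(2k)} \right). \tag{14}$$

Using again for $\gamma$ the expression

$$\gamma = \lim_{N \to \infty} \left\{ \sum_{k=1}^{N} \frac{1}{k} - \log N \right\}$$

we obtain from (14)

$$\Pi(z) = \sqrt{2\pi}\, (z + 1)^z e^{-(z+1)} \lim_{N \to \infty} \left\{ \sqrt{N} \prod_{k=1}^{N} \left( \frac{e_{z+k}}{e} \right) \right\}$$

$$= \sqrt{2\pi}\, (z + 1)^z e^{-(z+1)}\, \frac{e}{e_z} \lim_{N \to \infty} \left\{ \sqrt{N} \prod_{k=0}^{N} \left( \frac{e_{z+k}}{e} \right) \right\}$$

or

$$\Pi(z) = \sqrt{2\pi}\, z^z e^{-z} \lim_{N \to \infty} \left\{ \sqrt{N} \prod_{k=0}^{N} \left( \frac{e_{z+k}}{e} \right) \right\} \tag{15}$$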
which is another expression for the gamma function and which can be compared 


with (8). 
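
As with (8), the limit in (15) can be tested numerically for real z > 0. The sketch evaluates the logarithm of the expression at one finite N, so only approximate agreement with the library value should be expected, with an error of order 1/N; the function name and the choice of N are assumptions.

```python
import math

def log_pi_formula_15(z, n_terms=200_000):
    """log of sqrt(2*pi) z^z e^{-z} sqrt(N) prod_{k=0}^{N} e_{z+k}/e, z > 0."""
    total = 0.5 * math.log(2.0 * math.pi) + z * math.log(z) - z
    total += 0.5 * math.log(n_terms)
    for k in range(n_terms + 1):
        a = z + k
        total += a * math.log1p(1.0 / a) - 1.0    # log(e_{z+k} / e)
    return total

for z in (1.0, 2.5):
    print(z, log_pi_formula_15(z), math.lgamma(z + 1.0))
```

Here each factor $e_{z+k}/e$ is slightly less than 1, and the growing factor $\sqrt{N}$ compensates the resulting slow decay of the partial products.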


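Finally, from (15) and using Stirling’s formula (7) we obtain as companion
relation to (12):

$$\sqrt{z}\; e^{\mu(z)} = \lim_{N \to \infty} \left\{ \sqrt{N} \prod_{k=0}^{N} \left( \frac{e_{z+k}}{e} \right) \right\}. \tag{16}$$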


REFERENCES 


1. E. Artin, Einführung in die Theorie der Gammafunktion, Hamburg. Math. Einzelschrift,
11 (Leipzig, 1931).

2. H. Bohr and J. Mollerup, Lærebog i matematisk Analyse (Copenhagen, 1922).

3. F. Lösch and F. Schoblik, Die Fakultät (Leipzig, 1951).


22 Chemin Krieg, Geneva 





THE FIRST FACTOR OF THE CLASS NUMBER OF A 
CYCLIC FIELD 


LEONARD CARLITZ 


1. Introduction. Let p be a fixed prime > 3. The first factor of the class number
of the field R(ζ), where R is the rational field and $\zeta = e^{2\pi i/p}$, is determined by means of

$$h = (2p)^{-\frac12 (p-3)} f(Z) f(Z^3) \cdots f(Z^{p-2}), \tag{1.1}$$

where

$$f(x) = r_0 + r_1 x + r_2 x^2 + \ldots + r_{p-2} x^{p-2},$$

$Z = e^{2\pi i/(p-1)}$, r is a primitive root (mod p), and $r_i$ is the least positive residue
of $r^i$ (mod p). Vandiver (4) has proved that if n is an arbitrary integer ≥ 1,
then

$$h \equiv 2^{-\frac12 (p-3)}\, p \prod_{s} B_{sp^n+1} \pmod{p^n}, \tag{1.2}$$

where s = 1, 3, ..., p − 2, and the $B_m$ are the Bernoulli numbers in the even
suffix notation.

Let p − 1 = ab, where b is odd and greater than 1. Consider the cyclic field
K ⊂ R(ζ) of degree a over R. The class number of K can be expressed in the
form $h_a \cdot A/R$, where (1, p. 332)

$$h_a = 2^{1 - \frac12 a}\, p^{-\frac12 a} \prod_{u} f(Z^{bu}), \tag{1.3}$$

where u = 1, 3, ..., a − 1, and f(x) and Z have the same meaning as in (1.1).
The numbers $h_a$ and A/R are the first and second factors, respectively, of the
class number of K; since the second factor is not used below we omit the precise
description of this number. Beeger (2) has proved that $h_a$ is a rational integer.
In the present note we prove the formula

$$h_a \equiv 2^{1 - \frac12 a} \prod_{u} B_{bup^n+1} \pmod{p^n}, \qquad n \geq 1, \tag{1.4}$$

where u = 1, 3, ..., a − 1. The proof is similar to that of (1.2). Note that
(1.3) does not include (1.1); also for b > 1, (1.4) does not reduce to (1.2).


2. Proof of (1.4). As in (4), let

$$\mathfrak{p} = (Z - r,\ p) \tag{2.1}$$

denote one of the prime ideal factors of (p) in the field R(Z). Then for arbitrary
k, m we have

$$Z^{kp^m} \equiv r^{kp^m} \pmod{\mathfrak{p}^{m+1}},$$

so that





Received December 19, 1952 













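$$f(Z^k) = f(Z^{kp^m}) \equiv f(r^{kp^m}) \pmod{\mathfrak{p}^{m+1}}. \tag{2.2}$$

Thus by (1.3) and (2.2)

$$p^{\frac12 a} h_a \equiv 2^{1 - \frac12 a} \prod_{u} f(r^{bup^m}) \pmod{\mathfrak{p}^{m+1}}, \tag{2.3}$$

where u = 1, 3, ..., a − 1.
Now take m = n + ½a, and (2.3) becomes

$$h_a \equiv 2^{1 - \frac12 a}\, p^{-\frac12 a} \prod_{u} f(r^{bup^{n + \frac12 a}}) \pmod{p^{n+1}}.$$

Since by Fermat’s theorem

$$r^{bup^{n + \frac12 a}} \equiv r^{bup^n} \pmod{p^{n+1}},$$

it follows that

$$h_a \equiv 2^{1 - \frac12 a}\, p^{-\frac12 a} \prod_{u} f(r^{bup^n}) \pmod{p^n}. \tag{2.4}$$

In the next place, it follows from $r^i \equiv r_i$ (mod p) that

$$r^{ikp^n} \equiv r_i^{kp^n} \pmod{p^{n+1}},$$

so that

$$f(r^{kp^n}) = \sum_{i=0}^{p-2} r_i\, r^{ikp^n} \equiv \sum_{i=0}^{p-2} r_i^{kp^n + 1} \pmod{p^{n+1}}. \tag{2.5}$$

But since the numbers $r_0, r_1, \ldots, r_{p-2}$ are a permutation of the numbers 1, 2,
..., p − 1, it is clear that (2.5) is the same as

$$f(r^{kp^n}) \equiv \sum_{s=1}^{p-1} s^{kp^n + 1} \pmod{p^{n+1}}. \tag{2.6}$$

Now using the well-known formula

$$\sum_{k=1}^{p-1} k^m = \frac{B_{m+1}(p) - B_{m+1}}{m + 1},$$

where $B_{m+1}(x)$ denotes the Bernoulli polynomial of degree m + 1, it follows
easily that

$$f(r^{bup^n}) \equiv p\, B_{bup^n+1} \pmod{p^{n+1}}. \tag{2.7}$$

Substituting from (2.7) in (2.4) we get

$$h_a \equiv 2^{1 - \frac12 a} \prod_{u} B_{bup^n+1} \pmod{p^n},$$

which is the same as (1.4).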

3. Some special cases. If we take n = 1, (1.4) becomes

$$h_a \equiv 2^{1 - \frac12 a} \prod_{u} B_{bup+1} \pmod{p}. \tag{3.1}$$

But by Kummer’s congruence (3, chap. 14)

$$\frac{B_{bup+1}}{bup + 1} \equiv \frac{B_{bu+1}}{bu + 1} \pmod{p}, \tag{3.2}$$

so that (3.1) reduces to

$$h_a \equiv 2^{1 - \frac12 a} \prod_{u} \frac{B_{bu+1}}{bu + 1} \pmod{p}. \tag{3.3}$$

It follows at once from (3.3) that the first factor of the class number of K is
divisible by p if and only if the numerator of at least one of the numbers

$$B_{bu+1} \qquad (u = 1, 3, \ldots, a - 1)$$

is divisible by p.
Let p ≡ 3 (mod 4), a = 2, b = ½(p − 1). Thus K is the quadratic field
$R((-p)^{\frac12})$. Since the class number of K is now determined by

$$h = -\frac{1}{p} \sum_{k=1}^{p-1} k \left( \frac{k}{p} \right), \tag{3.4}$$

where (k/p) is the Legendre symbol, comparison of (3.4) with (1.3) shows
that in this case $h_2 = -h$. Also we see at once that (1.4) reduces to

$$h \equiv -B_{\frac12 (p-1)p^n + 1} \pmod{p^n}. \tag{3.5}$$

In particular (3.5) includes the well-known formula

$$h \equiv -2B_{\frac12 (p+1)} \pmod{p}. \tag{3.6}$$

(Since 1 ≤ h < p, it is clear that $B_{\frac12 (p+1)} \not\equiv 0$ (mod p); indeed this is a
consequence of the fact that ½(p − 1) is odd, as is evident from (3.7).)
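
The congruence (3.6) can be checked for small primes p ≡ 3 (mod 4), p > 3. The sketch below is only an illustration: it computes h from the sum in (3.4), computes the Bernoulli numbers by the standard recurrence with exact fractions, and reduces $-2B_{\frac12(p+1)}$ modulo p; all function names are hypothetical, and Python 3.8 or later is assumed for `math.comb` and modular inverses via `pow`.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """B_0, ..., B_n_max by the usual recurrence (B_2 = 1/6, B_4 = -1/30, ...)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def legendre(k, p):
    """Legendre symbol (k/p) for an odd prime p, by Euler's criterion."""
    t = pow(k, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def class_number(p):
    """h from (3.4); meaningful for primes p = 3 (mod 4), p > 3."""
    total = sum(k * legendre(k, p) for k in range(1, p))
    assert total % p == 0
    return -total // p

def rhs_3_6(p):
    """-2 B_{(p+1)/2} reduced modulo p."""
    b = bernoulli_numbers((p + 1) // 2)[(p + 1) // 2]
    return (-2 * b.numerator * pow(b.denominator, -1, p)) % p

for p in (7, 11, 19, 23, 31, 43, 47):
    print(p, class_number(p), rhs_3_6(p))
```

For each of the primes listed the two printed values agree, as (3.6) asserts; since h < p, the residue determines h itself.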
Again, in place of (3.4) let us use 


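$$\text{(3.7)}\qquad h = \left\{2 - \left(\frac{2}{p}\right)\right\}^{-1} \sum_{k=1}^{\frac12(p-1)} \left(\frac{k}{p}\right).$$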


Since it follows from 
RPo-” = (4) (mod p) 


Ate kb 
pie-Der — (4) (mod p"*"), 
> Pp 


it is evident that (3.7) implies 


ff. . \- }(p—1) a j. (2) B,,(4(p + 1)) — Bn 
@s) b= {2-(2)5 dF =? -\/S " 


that 


(mod p"*"), 
where m = }(p — 1)p" + 1. Using the formula 


Bn(}p) + Bn(tp + 4) = 2° "Ba (p), 


we get 
l—m j 2 { a+i 
B,(4(p + 1)) = 2° “B, — B, = (2) — 1¢B,, (mod p""’), 
so that (3.8) yields 
(3.9) h=- Bn =— Bn ; (mod p”*' 










It can be shown, using Kummer’s congruence, that (3.5) implies (3.9); also
for m = 0, (3.9) is identical with (3.6).


REFERENCES 


1. N. G. W. H. Beeger, Over de deelichamen van het cirkellichaam der l"-de machtswortels uit 
de eenheid en hunne klassenaantallen (1ste gedeelte), Konink. Akad. Wet. Amsterdam,
27 (1918), 324-336. 


2. ———, Over de deelichamen van het cirkellichaam der l*-de machtswortels uit de eenheid en 
hunne klassenaantallen (3de gedeelte), Konink. Akad. Wet. Amsterdam, 27 (1918), 
822-827. 

3. N. Nielsen, Traité élémentaire des nombres de Bernoulli (Paris, 1924). 


4. H. S. Vandiver, On the first factor of the class number of a cyclotomic field, Bull. Amer. Math.
Soc., 25 (1918-19), 458-461.


Duke University 








CORE-CONSISTENCY AND TOTAL INCLUSION FOR 
METHODS OF SUMMABILITY 


G. G. LORENTZ AND A. ROBINSON


1. Introduction. We shall consider methods of summation A, B,
defined by matrices of real elements $(a_{mn})$, $(b_{mn})$ $(m, n = 1, 2, \ldots)$ which are
regular, that is, have the three well-known properties of Toeplitz (4, p. 43).
A method A is said to be core-consistent with the method B for bounded sequences
if the A-core (3, p. 137; and 4, p. 55) of each real bounded sequence is
contained in its B-core. B is totally included in A, $B \ll A$, if each real sequence
which is B-summable to a definite limit (this limit may be finite or infinite of
a definite sign) is also A-summable to the same limit. It will be shown in the
present paper that if the matrix A is core-consistent with the positive matrix B,
then A is "almost" divisible by B on the right. This statement is made precise
in Theorem 1 below. The proof (§2) involves some elementary properties of
convex sets in Banach spaces. In §3, the same method is used to prove a similar
result for the relation $B \ll A$ (Theorem 2). Some simple corollaries are given
in §4.

Let $l_1$ be the Banach space of elements $x = (x_n)$, with norm

$$\|x\| = \sum_n |x_n|,$$

so that the rows of the matrices A, B are elements $a_m$, $b_m$ of $l_1$. Elements $x, y \in l_1$
are called disjoint if $x_n y_n = 0$ $(n = 1, 2, \ldots)$; an element $x \in l_1$ is positive, $x \ge 0$,
if $x_n \ge 0$ $(n = 1, 2, \ldots)$. If $x = (x_1, x_2, \ldots, x_n, \ldots) \in l_1$, we shall write

$$x^q = (x_1, \ldots, x_q, 0, 0, \ldots), \qquad x_p = (0, \ldots, 0, x_p, x_{p+1}, \ldots),$$
$$x_p^q = (0, \ldots, 0, x_p, \ldots, x_q, 0, \ldots), \qquad p \le q.$$

We also use the same notation for sets $E \subset l_1$, for instance $E_p^q$ is the set of all
$x_p^q$ with $x \in E$. A cone $K \subset l_1$ is a set such that

$$\sum_{k=1}^{n} c_k x_k \in K$$

whenever $c_k \ge 0$, $x_k \in K$. For instance, the set of all positive elements is a
cone in $l_1$.
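
The notation just introduced can be tried out on finite lists. The sketch below is only an illustration: it uses finitely supported sequences as a stand-in for elements of $l_1$, and the function names `l1_norm`, `disjoint` and `truncate` are hypothetical, not taken from the paper.

```python
def l1_norm(x):
    """Norm ||x|| = sum of |x_n| for a finitely supported sequence."""
    return sum(abs(t) for t in x)


def disjoint(x, y):
    """True when x_n * y_n = 0 for every index n."""
    return all(s * t == 0 for s, t in zip(x, y))


def truncate(x, p=1, q=None):
    """Return x_p^q: keep coordinates p..q (1-indexed, inclusive) and zero the rest."""
    q = len(x) if q is None else q
    return [t if p <= n <= q else 0 for n, t in enumerate(x, start=1)]


x = [3, -1, 0, 2, 5]
y = [0, 0, 4, 0, 0]

if __name__ == "__main__":
    print(l1_norm(x), disjoint(x, y), truncate(x, 2, 4))
```

Truncation never increases the norm, which is the elementary fact used repeatedly when distances to the truncated cone are compared in §2.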
We shall prove the following theorems: 


THEOREM 1. Let A, B be regular matrices and let A be core-consistent with
B. If B is positive, that is if $b_m \ge 0$ $(m = 1, 2, \ldots)$, there is a positive regular
matrix C such that the norm of the mth row of CB − A tends to zero for $m \to \infty$.

Received December 22, 1952; in revised form May 8, 1953 















The case where the elements of the sequences, or of the matrices, are complex 
is not essentially different as will be shown in §2. 

If $A = (a_{mn})$, we shall write $A_p$ for the matrix obtained from A by replacing
all $a_{mn}$ with $n < p$ by zeros.


THEOREM 2. If A, B are regular row-finite matrices, B positive and

(i) $B \ll A$,

there is an integer p and a regular positive row-finite matrix C such that

(1) $CB_p = A_p$;

this remains true if (i) is replaced by the (formally weaker) hypothesis that
(ii) $\tau_m \to +\infty$ always implies $|\sigma_m| \to +\infty$, where $\sigma_m$ and $\tau_m$ are the A- and the
B-transforms of a sequence $s_n$, respectively.


If B is the unit matrix I, these results were known before; for the case of
Theorem 1 see Agnew (1), also (3, p. 149); for Theorem 2, Hurwitz (5) or
(4, p. 53).


2. Core-consistency. If Theorem 1 is true for a given pair of matrices
A, B, it is also true for any two matrices A', B' with rows $a'_m$, $b'_m$ satisfying

$$\|a_m - a'_m\| \to 0, \qquad \|b_m - b'_m\| \to 0.$$

This and the regularity of A, B imply that we may assume A, B to be row-finite,
and such that there is a sequence $n(m)$ increasing to $+\infty$ with $a_{mn} = b_{mn} = 0$
for $n < n(m)$.


LEMMA. In the above conditions there exist two sequences $p = p(m) \le q = q(m)$
such that $p(m) \to \infty$ for $m \to \infty$ and that

(2) $\rho(a_m, K) = \rho(a_m, K_p^q)$;

here $\rho(a_m, K)$ is the distance from $a_m$ to the cone K generated by the $b_\lambda$ $(\lambda = 1, 2, \ldots)$.


Proof. For a given m, let $m_1 \le m_2$ be such that $b_\mu$ is disjoint with $a_m$ if $\mu$
does not satisfy $m_1 \le \mu \le m_2$; we may assume that $m_1 \to \infty$ for $m \to \infty$.
Let K' be the cone generated by the $b_\mu$, $m_1 \le \mu \le m_2$, let $p(m) = n(m_1)$
and let q be so large that $b_{\mu n} = 0$, $m_1 \le \mu \le m_2$, $a_{mn} = 0$ for $n > q$. Then
$(a_m)_p^q = a_m$, $(K')_p^q = K'$, and therefore

(3) $\rho(a_m, K) \le \rho(a_m, K') = \rho(a_m, (K')_p^q)$.

On the other hand, let $x \in K$, then x is a linear combination, with positive
coefficients, of some of the $b_\lambda$. If we omit from it all those $b_\lambda$ which are not
$b_\mu$, we shall obtain another element $x' \in K'$. The omitted $b_\lambda$ are disjoint with
$a_m$ and all $b_\lambda$ satisfy $b_\lambda \ge 0$. This implies

$$\|a_m - x_p^q\| \ge \|a_m - (x')_p^q\|.$$

Since $(K')_p^q \subset K_p^q$, it follows that $\rho(a_m, K_p^q) = \rho(a_m, (K')_p^q)$ and using (3) we
obtain $\rho(a_m, K) \le \rho(a_m, K_p^q)$. The inverse inequality is obvious, and (2) follows.







Proof of Theorem 1. We shall show that

(4) $\rho(a_m, K) \to 0$.

If this is not true, there exists by the Lemma an $\epsilon > 0$, a sequence of disjoint
$a_{m_i}$, and a sequence of disjoint intervals $[p_i, q_i]$ with

$$\rho(a_{m_i}, K_{p_i}^{q_i}) \ge \epsilon.$$

If $y = \sum c_i a_{m_i}$ is a linear combination of the $a_{m_i}$ with $c_i \ge 0$, $\sum c_i = 1$ and if $x \in K$, we can
put

$$x_{p_i}^{q_i} = c_i z_i, \qquad z_i \in K_{p_i}^{q_i},$$

and have

$$\|y - x\| \ge \Big\|\sum c_i a_{m_i} - \sum c_i z_i\Big\| = \sum c_i \|a_{m_i} - z_i\| \ge \epsilon.$$

This shows that the convex set E generated by the $a_{m_i}$ is at a distance $\ge \epsilon$
from K, hence the $\epsilon$-neighbourhood $E_\epsilon$ of E is disjoint with K. If $K_\epsilon$ is the
cone generated by $E_\epsilon$, K and $K_\epsilon$ are disjoint except for the origin. By a well-known
theorem (7, Theorem 1.2), there is in $l_1$ a bounded linear functional
f(x) of norm one which is positive on K and negative on $K_\epsilon$. Hence $f(y) \le -\epsilon$
on E (7, Lemma 1.2). This means that there is a bounded sequence $s_\nu$ with

$$\sum_\nu b_{m\nu} s_\nu \ge 0 \quad (m = 1, 2, \ldots), \qquad \sum_\nu a_{m_i\nu} s_\nu \le -\epsilon \quad (i = 1, 2, \ldots),$$

and contradicts the hypothesis of Theorem 1.
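From (4) it follows that for some row-finite positive matrix $C = (c_{mn})$,

$$\Big\|a_m - \sum_n c_{mn} b_n\Big\| \to 0, \qquad m \to \infty.$$

Finally, this C will be necessarily regular, provided we agree to take $c_{mn} = 0$
whenever $b_n = 0$. For

$$c_{mn} b_{n\nu} \le \sum_{n=1}^{\infty} c_{mn} b_{n\nu} = a_{m\nu} + o(1) = o(1), \qquad m \to \infty,$$

implies that $c_{mn} \to 0$ for $m \to \infty$ and each n. On the other hand,

$$\sum_\nu a_{m\nu} = \sum_n \sum_\nu c_{mn} b_{n\nu} + o(1) = \sum_n c_{mn} \sum_\nu b_{n\nu} + o(1)$$

together with

$$\sum_\nu a_{m\nu} = 1 + o(1), \qquad \sum_\nu b_{n\nu} = 1 + o(1)$$

imply that $\sum_n c_{mn} \to 1$ for $m \to \infty$. This completes the proof.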


The concept of the core is defined also for sequences of complex numbers 
(3, p. 137). Accordingly, we may introduce the concept of core-consistency 













as well for matrices and sequences with complex elements. With this new 
definition, Theorem 1 holds literally as before. 

For the proof assume that A is a regular matrix with complex elements,
B is positive and that A is core-consistent with B for bounded sequences.
Hence, by Knopp’s core theorem (6, p. 115) or (4, p. 55), the core of the
B-transform of any bounded sequence $s_n$ is included in the core of $s_n$. But A is
core-consistent with B, and so the core of the A-transform of $s_n$ also is included
in the core of $s_n$. This implies (3, p. 149) that A = A' + V where A' is a positive
regular matrix, and the norm of the mth row of the matrix V tends to zero as
$m \to \infty$. Clearly, A' also is core-consistent with B, for complex or more
particularly, for real sequences. It then follows from the original Theorem 1 that there
exists a positive regular matrix C such that the norm of the mth row of CB − A'
tends to zero for $m \to \infty$. Consequently, the norm of the mth row of

CB − A = CB − A' − V

also tends to zero for $m \to \infty$. This proves our assertion.

The converse of this (as well as the converse of Theorem 1) is a direct
consequence of Knopp’s core theorem. Thus, let A, B, C be three regular matrices,
C positive, such that the norm of the mth row of CB − A tends to zero for
$m \to \infty$. Then the core of the transform of any bounded complex sequence $s_n$
by CB coincides with the core of the transform of $s_n$ by A. The transform of
$s_n$ by CB is the transform by C of the transform of $s_n$ by B. Hence the core of
the transform of $s_n$ by CB is included in the core of the transform of $s_n$ by B,
by virtue of Knopp’s core theorem. In other words, CB, and hence A, are
core-consistent with B for bounded sequences.


3. Total inclusion. We shall now prove Theorem 2, deducing (1) from the
hypothesis (ii). Let $\rho_{mp} = \rho(a_{mp}, K_p)$; we first show that

(5) $\rho_{mp} = 0$ for all p sufficiently large and $m = 1, 2, \ldots$.

Let (5) be false. Since the $\rho_{mp}$ decrease for m fixed and increasing p and finally
become zero, we deduce that for each p, $\rho_{mp} > 0$ for an infinity of m. Now
$\rho_{mp} > 0$ implies the existence of $\delta > 0$, $\epsilon > 0$ such that the sphere S in $l_1$ with
center $a_{mp}$ and radius $\delta$ does not have common points with the cone K' generated
by the points $b_{\lambda p}$ $(\lambda = 1, 2, \ldots)$, and by the spheres with radii $\epsilon$ around those
of the $b_{\mu p}$ $(\mu = 1, 2, \ldots, m)$ which are not zero. Hence, there is a functional

$$f(x) = \sum x_n s_n, \qquad \|f\| = 1 \text{ in } l_1,$$

generated by a bounded sequence $s_n$ with $s_n = 0$ for $n < p$, such that the
hyperplane f(x) = 0 separates S and K' and supports S (by Eidelheit's theorem,
(7, Theorem 1.6)). If $f(x) \ge 0$ on K', we have

$$\tau_\mu = f(b_{\mu p}) \ge \epsilon \quad \text{for } b_{\mu p} \ne 0 \ (\mu = 1, \ldots, m)$$

(7, Lemma 1.2) and

$$0 \ge \sigma_m = f(a_{mp}) \ge -\|f\|\,\delta = -\delta.$$





By fixing « > 0, taking 6 > 0 sufficiently small, and then multiplying the s, 
with a sufficiently large positive number, we obtain the following statement: 


(*) For each m, p with p», > 0 and for any two positive numbers M, 7, 
there is a bounded sequence s, with s, = 0 for m < p such that 


om = >, Onn Sn = — 1, mr = > ban Sn > 0 (A = 1,2,...), 
t > M if 1<u<m and b,, # 0. 


We now define inductively increasing sequences of integers p;, po, . . . , mm, 
mM», .. . and bounded sequences s“® satisfying s,‘° = 0 for n < p,. If 
(1) (i—1) 
ino < 6 pent Ty ss pM 5ccc38 
are already defined, take p,; so large that a,, = 0 for n > p,, uw = my... , my-1, 


then find an m, > m,_, with 


2 Guat” +... FQN <b, eine, K.) >'6 


By (*), there is a bounded sequence s‘" with s,‘” = 0 for n < p, such that 

(6) ) +...¢5s,°) =—-—1, 

(7) > dna Se” > 0 (A = 1,2,...), 
(8) > ha” > if b,», # 0 (so = 1,...,). 


Let 8 = (s,) be the sequence defined by s, = }-, s,°"; for each m this sum has 
only a finite number of terms. Since a,,, and s‘” are disjoint for 7 > i, we have 
by (6), om, = — 1, and by (7) and (8), »~— ©, which contradicts the hypo- 
thesis and proves (5). 

Fixing a p for which (5) holds, we consider an arbitrary m. For each $\epsilon > 0$
there is an x in K such that

(9) $\|a_{mp} - x_p\| < \epsilon, \qquad x = \sum_{\mu=1}^{\nu} c_\mu b_\mu, \quad c_\mu \ge 0.$

Let q be the last index with $a_{mq} \ne 0$. If we omit from the last sum all $b_\mu$
for which

$$\sum_{n > q} b_{\mu n} > \sum_{n \le q} b_{\mu n}$$

we shall obtain an element $x' \in K$ with

$$\|a_{mp} - x'_p\| \le \|a_{mp} - x_p\|.$$

It follows that $\nu$ in (9) may be assumed bounded for all $\epsilon$. Then we must have

(10) $a_{mp} = \sum_{n=1}^{\nu} c_{mn} b_{np}, \qquad c_{mn} \ge 0.$

This proves the theorem, for the argument used in the proof of Theorem 1
shows that $C = (c_{mn})$ is regular, provided in (10) we take $c_{mn} = 0$ whenever
$b_{np} = 0$.













We give some corollaries to Theorem 2, assuming that the matrices A, B 
are regular and row-finite and that B is positive. We compare the following 
relations (for the definition of the core of a possibly unbounded sequence 
see (4, p. 55)): 

(i) $B \ll A$.

(ii) For each sequence $s_n$, $\tau_m \to +\infty$ implies that $|\sigma_m| \to +\infty$.

(iii) $A_p = CB_p$ for some p with $C \ge 0$.

(iv) A is core-consistent with B for all real sequences. 

(v) A is core-consistent with B for all complex sequences. 

Then we have: 


THEOREM 3. Conditions (i)-(v) are equivalent. 


Proof. Clearly, (i) $\to$ (ii). Theorem 2 shows that (ii) implies (iii) and it is
easy to see that (iii) $\to$ (i). From the definitions of the properties concerned
we have (v) $\to$ (iv) $\to$ (ii). Finally, Knopp’s core theorem states that (iii)
$\to$ (v). This completes the proof.


4. Applications. For further illustration of Theorems 1 and 2 we shall
give some applications to totally equivalent and core-equivalent methods.
Two methods A, B are totally equivalent if $A \ll B$ and $B \ll A$; they are
core-equivalent for bounded sequences if the A-core of each bounded sequence
coincides with its B-core. In what follows, V is a matrix such that the norm of the
mth row tends to zero for $m \to \infty$, and I is the unit matrix.


THEOREM 4. (i) A method A is core-equivalent with I for bounded sequences
if and only if A has a representation

(11) A = A' + V

with positive A', where A' contains a sequence of rows of the form

(12) $0, 0, \ldots, 0, a'_{m_n n}, 0, 0, \ldots \qquad (n = 1, 2, \ldots)$

(then necessarily $m_n \to \infty$, $a'_{m_n n} \to 1$ for $n \to \infty$.)
(ii) A regular row-finite method A is totally equivalent with I if and only if
for some p, $A_p$ is positive and contains a sequence of rows of the form (12).


Proof. (i) The conditions are clearly sufficient. It follows from Theorem 1
that (11) with a positive A' is necessary. Again by Theorem 1, there is a positive
regular matrix C and a V' with CA' = I + V'. For each n we have

(13) $\sum_{m=1}^{\infty} c_{nm} a'_m = e_n$

with $e_{ni} \ge 0$, $e_n = (e_{ni})$, $e_{nn} \to 1$ and

$$\sum_{i \ne n} e_{ni} \to 0$$

for $n \to \infty$. Let

$$\epsilon_n = \frac{1}{e_{nn}} \sum_{i \ne n} e_{ni}.$$

Then $\epsilon_n \to 0$ for $n \to \infty$. Since the $c_{nm}$ are all positive, it follows from (13) that
there is at least one $m = m_n$ such that

$$\sum_{i \ne n} a'_{mi} \le \epsilon_n a'_{mn}.$$

For otherwise, multiplying the relations

$$\sum_{i \ne n} a'_{mi} > \epsilon_n a'_{mn} \qquad (m = 1, 2, \ldots)$$

with $c_{nm}$ and adding we would obtain by means of (13) that

$$\sum_{i \ne n} e_{ni} > \epsilon_n e_{nn},$$

which contradicts the definition of $\epsilon_n$.
We now replace by zero the elements $a'_{mi}$ of the rows of A' with $m = m_n$,
$i \ne n$ $(n = 1, 2, \ldots)$. Denoting the matrix thus obtained again by A', we see
that (11) and (12) are satisfied. This proves (i); the proof of (ii) is similar.


Theorem 4(i) may serve to show, for instance, that if a regular Hausdorff
method $H_g$ is core-equivalent with I for bounded sequences, then $H_g$ is identical
with I.

A method A is normal if $a_{mn} = 0$ for $n > m$ and $a_{mm} \ne 0$ $(m = 1, 2, \ldots)$. In
this case A has an inverse $A^{-1}$. If A, B are normal, there is a triangular matrix
C with A = CB.


THEOREM 5. Let the regular normal methods A, B be totally equivalent. Then
there exists a sequence $c_m \to 1$ such that for some p,

(14) $a_{mn} = c_m b_{mn}; \qquad m = 1, 2, \ldots;\ n = p, p+1, \ldots.$


Proof. Let A = CB, B = DA, then the matrices C, D are triangular,
regular and totally equivalent with I. We have

$$a_{mm} = c_{mm} b_{mm}, \qquad b_{mm} = d_{mm} a_{mm};$$

hence

$$c_{mm} d_{mm} = 1,$$

and we obtain $c_{mm} \to 1$. From Theorem 4 (ii) it follows that for all sufficiently
large n, $c_{mn} = 0$ if $m \ne n$. Putting $c_m = c_{mm}$, we obtain (14).

It should be added that sometimes it is even possible to prove that A, B are
identical if they are totally equivalent. Let $A = H_g$, $B = H_{g_1}$ be two regular
and normal Hausdorff methods. Then

$$\sum_n |a_{mn}|$$

converges for $m \to \infty$ to the "essential" total variation of g(x). From (14) it
follows that

$$\sum_n |a_{mn} - b_{mn}| \to 0,$$

hence g and $g_1$ are essentially identical. Thus we obtain a remark of Bosanquet
(2, p. 452) that $H_g$, $H_{g_1}$ are identical if they are totally equivalent.


REFERENCES 


1. R. P. Agnew, Cores of complex sequences and of their transforms, Amer. J. Math. 61 (1939), 
178-186. 

2. S. K. Basu, On the total relative strength of the Hölder and Cesàro methods, Proc. London
Math. Soc. (2), 50 (1949), 447-462.

3. R. G. Cooke, Infinite matrices and sequence spaces (London, 1950). 

4. G. H. Hardy, Divergent series (Oxford, 1949). 

5. W. A. Hurwitz, Some properties of methods of evaluation of divergent sequences, Proc. London 
Math. Soc. (2), 26 (1926), 231-248. 

6. K. Knopp, Zur Theorie der Limitierungsverfahren. Math. Zeitschrift, 31 (1929-30), pp. 
97-127, 276-305. 

7. M.G. Krein and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, 
Uspehi Mat. Nauk (N.S.), 3, no 23 (1948), 3-95; Amer. Math. Soc. Translations no. 
26 (1950). 


Wayne University 
and 
University of Toronto 





AN EMBEDDING THEOREM FOR BALANCED 
INCOMPLETE BLOCK DESIGNS 


MARSHALL HALL Jr. AND W. S. CONNOR


1. Introduction. From a symmetric balanced incomplete block design we
may construct a derived design by deleting a block and its varieties. But a
design with the parameters of a derived design may not be embeddable in a
symmetric design. Bhattacharya (1) has such an example with $\lambda = 3$. When
$\lambda = 1$, the derived design is a finite Euclidean plane and this can always be
embedded in a corresponding symmetric design which will be a finite projective
plane.

In this paper it is shown that for $\lambda = 2$ as well as for $\lambda = 1$ a design with
the parameters of a derived design is indeed embeddable in a symmetric design.
The methods used depend on techniques developed in (2). It is interesting to
note that for k = 4 the entire embedding was carried out by Nandi (3).


2. General conditions for embedding. A balanced incomplete block design
with parameters v, b, r, k, and $\lambda$ satisfying

(2.1) $bk = vr, \qquad r(k - 1) = (v - 1)\lambda$

is symmetric if v = b. From a symmetric design S, a derived design D' may
be obtained by deleting a block $B_0$ and the varieties of $B_0$ throughout. This
leaves a design D' with parameters $v' = v - k$, $b' = b - 1$, $r' = r$, $k' = k - \lambda$,
and $\lambda' = \lambda$. Since r = k in the symmetric design, $r' = k' + \lambda'$ in the derived
design. Thus, for every derived design D' the parameters satisfy

$$r' = k' + \lambda', \qquad v'\lambda' = k'(k' + \lambda' - 1),$$
$$b'\lambda' = (k' + \lambda')(k' + \lambda' - 1).$$
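
The parameter bookkeeping above is mechanical and can be checked with a few lines of code. This is an illustrative sketch only; the function names are hypothetical. It starts from admissible symmetric parameters $(v, k, \lambda)$, forms the derived parameters, and tests the three relations just stated.

```python
def derived_parameters(v, k, lam):
    """Parameters (v', b', r', k', lam') of a derived design of a symmetric (v, k, lam) design."""
    assert lam * (v - 1) == k * (k - 1), "not admissible symmetric parameters"
    b, r = v, k  # in a symmetric design b = v and r = k
    return v - k, b - 1, r, k - lam, lam


def satisfies_derived_relations(v, b, r, k, lam):
    """Check r = k + lam, v*lam = k(k + lam - 1), b*lam = (k + lam)(k + lam - 1)."""
    return (r == k + lam
            and v * lam == k * (k + lam - 1)
            and b * lam == (k + lam) * (k + lam - 1))


if __name__ == "__main__":
    for params in [(7, 3, 1), (16, 6, 2), (25, 9, 3)]:
        d = derived_parameters(*params)
        print(params, "->", d, satisfies_derived_relations(*d))
```

The third case reproduces the parameters v = 16, b = 24, r = 9, k = 6, $\lambda = 3$ of the example attributed to Bhattacharya in §3; satisfying the relations is of course only necessary, not sufficient, for embeddability.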
It is not difficult to state conditions on a design with appropriate parameters 


so that it is recognizable as a derived design of some symmetric design. These 
conditions follow: 


THEOREM 2.1 A design D with parameters satisfying

(2.2) $r = k + \lambda, \qquad v\lambda = k(k + \lambda - 1), \qquad b\lambda = (k + \lambda)(k + \lambda - 1)$

can be embedded as a derived design in a symmetric design if and only if we can
find in D sets of blocks $S_j$, $j = 1, \ldots, k + \lambda$, such that:

Received March 19, 1953. Theorem 3.2 was proved independently by both authors. Upon 
learning of this by correspondence they decided to write the present joint paper, which is a 
synthesis of the two original manuscripts. 















(1) Each $S_j$ consists of $k + \lambda - 1$ blocks of D.

(2) The blocks of an $S_j$ together contain each variety of D $\lambda$ times.

(3) Any two distinct sets $S_i$, $S_j$ have exactly $\lambda - 1$ blocks in common.

(4) Any block of D is in precisely $\lambda$ sets $S_j$.


Proof. Let us adjoin to D new varieties $x_1, \ldots, x_{k+\lambda}$ and a new block $B_0$
consisting of these varieties. We also adjoin the variety $x_j$ to all blocks of the
set $S_j$ and to no other blocks. Then the new array S contains b* = b + 1 blocks
and v* = v + k + $\lambda$ varieties. The block $B_0$ contains k + $\lambda$ varieties and by
(4) we have adjoined just $\lambda$ new varieties to each old block. Hence, in S each
block contains k* = k + $\lambda$ varieties. Each old variety appeared r = k + $\lambda$
times and each new variety $x_j$ appears once in $B_0$, and in the k + $\lambda$ − 1 blocks
of $S_j$. Hence, in S each variety appears k + $\lambda$ = r* times. Finally, in D each
pair $d_i$, $d_j$ occurred together $\lambda$ times; by (2) a new $x_j$ occurs with each $d_i$ of D
$\lambda$ times and by (3) a new pair $x_i$, $x_j$ occurs together in $\lambda$ − 1 old blocks and
once in the new block $B_0$. Thus every pair in S occurs together $\lambda$ times. S is
seen to be a balanced incomplete block design, and as r* = k*, S is a symmetric
design.

Conversely, if we drop from a symmetric design S a block $B_0$ and all of the
varieties of $B_0$, we may readily verify that the blocks of S containing a
suppressed variety $x_j$ form in the derived design D a set of blocks $S_j$, and the sets
$S_j$ have all of the properties mentioned in the theorem.
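
The construction in the proof can be traced on a small case with $\lambda = 1$. The sketch below is an illustration under stated assumptions, not the authors' procedure: it builds the affine plane over the integers modulo a prime q (an assumed concrete model of D), takes the q + 1 parallel classes as the sets $S_j$, adjoins one new variety per class together with the new block $B_0$, and counts how often each pair of varieties occurs. All function names are hypothetical.

```python
from itertools import combinations


def affine_plane(q):
    """Lines of the affine plane over Z_q (q prime), grouped into q + 1 parallel classes."""
    classes = []
    for slope in range(q):  # lines y = slope * x + c
        classes.append([frozenset((x, (slope * x + c) % q) for x in range(q))
                        for c in range(q)])
    classes.append([frozenset((c, y) for y in range(q)) for c in range(q)])  # lines x = c
    return classes


def embed(classes):
    """Adjoin a new variety x_j to every block of S_j and add the block B_0 of new varieties."""
    new_points = [("new", j) for j in range(len(classes))]
    blocks = [frozenset(new_points)]
    for j, S_j in enumerate(classes):
        blocks.extend(block | {new_points[j]} for block in S_j)
    return blocks


def pair_counts(blocks):
    """Set of values taken by the number of blocks containing a given pair of varieties."""
    points = list(set().union(*blocks))
    return {sum(1 for B in blocks if a in B and b in B)
            for a, b in combinations(points, 2)}


if __name__ == "__main__":
    S = embed(affine_plane(3))
    print(len(S), {len(B) for B in S}, pair_counts(S))
```

For q = 3 the result has 13 blocks of 4 varieties with every pair occurring once, that is, the parameters of the projective plane of order 3.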


3. The embedding theorems for $\lambda = 1$ and $\lambda = 2$. Not every design D with
parameters satisfying the relations (2.2) satisfies the embedding conditions.
Since it is known that any two distinct blocks of a symmetric design intersect
in $\lambda$ varieties, a design D cannot possibly be embedded if it has two blocks
intersecting in more than $\lambda$ varieties. Such an example with $\lambda = 3$, v = 16,
b = 24, r = 9, and k = 6 was found by Bhattacharya (1) and is listed in (2).
In this example there are two blocks with four varieties in common.

When $\lambda = 1$, the design D is readily seen to be a finite Euclidean plane and
it is well known that every such plane can be embedded in a finite projective
plane, this being the corresponding symmetric design. This known result and
the absence of a counter-example for $\lambda = 2$ comparable to that for $\lambda = 3$
suggested the embedding theorem for $\lambda = 2$ which is the main part of this
paper. The embedding theorem for $\lambda = 1$ will be included here not as a new
result but as an indication of the general motivation of the more complicated
embedding theorem for $\lambda = 2$.


THEOREM 3.1 Every design D with parameters $v = k^2$, $b = k^2 + k$, $r = k + 1$,
and $\lambda = 1$ satisfies the conditions of Theorem 2.1 and has a unique embedding
in a symmetric design S.

Proof. Let $B_1$ be an arbitrary block of D whose varieties are $a_1, a_2, \ldots, a_k$.
If c is any variety of D not in $B_1$, then there are k blocks containing the pairs
$ca_1, \ldots, ca_k$, and these are distinct since no $a_i$, $a_j$ occur together except in
$B_1$. This accounts for k of the k + 1 blocks containing c. Thus there is exactly
one block $B_2$ containing c and no variety of $B_1$. Let the varieties of $B_2$ be c,
$c_2, \ldots, c_k$. Then $B_2$ is, for each of $c_2, \ldots, c_k$, equally the unique block containing
this variety and none from $B_1$. Hence, in all there will be blocks $B_2, \ldots, B_k$
containing the $k^2 - k$ varieties of D not in $B_1$, and no one of $B_2, \ldots, B_k$
intersects $B_1$ in a variety. Moreover, no two of these B's have a variety in common
with each other since each is the unique block for each of its varieties not
intersecting $B_1$. The blocks $B_1, \ldots, B_k$ form a set $S_1$ of k blocks which together
contain each variety once. Moreover, each block determines uniquely such a
set of k non-intersecting blocks and there will be in all k + 1 sets $S_1, \ldots,$
$S_{k+1}$ and these sets have the properties required by Theorem 2.1. We note
finally that D determines these sets uniquely and so the embedding is unique.
For $\lambda = 2$ the same result holds but the proof is much harder.


THEOREM 3.2 Every balanced incomplete block design D with parameters 


$$v = \tfrac12 k(k+1), \qquad b = \tfrac12 (k+1)(k+2), \qquad r = k+2, \qquad \lambda = 2$$


satisfies the conditions of Theorem 2.1 and can be embedded uniquely in a symmetric 
design. 


The proof given in the following sections will not cover the case k = 6, 
shown to be impossible in (2). Nor will it cover the case k = 2, and so we shall 
always exclude this case without further mention. It is easy to verify by replace- 
ment of the varieties in the blocks that the theorem is true for k = 2. Throughout 
the proof we shall suppose that D exists. 


4. Some relations among the blocks of D. In this section we shall quote 
some lemmas from (2) and shall develop one new lemma, all of which will 
be useful later. 

Let two blocks of D which have u varieties in common be called "uth
associates." Then we may paraphrase Lemma 4.1 of (2) as follows:


LEMMA 4.1 Any block of D has 2k first associates, $\tfrac12 k(k - 1)$ second associates,
and zero sth associates ($s \ne 1, 2$).
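
Lemma 4.1 can be observed on a concrete design with k = 4. The sketch below is illustrative and rests on an assumption not taken from the paper: it uses the symmetric (16, 6, 2) design on a 4 x 4 grid, in which the block attached to a cell consists of the other cells in the same row or column, deletes one block with its varieties, and tallies the intersection sizes among the 15 remaining blocks. Function names are hypothetical.

```python
from itertools import product


def biplane_16():
    """Symmetric (16, 6, 2) design: block of a cell = other cells in its row or column."""
    cells = list(product(range(4), repeat=2))
    return {c: frozenset(p for p in cells
                         if p != c and (p[0] == c[0] or p[1] == c[1]))
            for c in cells}


def derived_design(blocks, deleted):
    """Delete the block indexed by `deleted` and remove its varieties from the others."""
    B0 = blocks[deleted]
    return [B - B0 for key, B in blocks.items() if key != deleted]


def associate_counts(design):
    """For each block, map intersection size u to the number of its uth associates."""
    result = []
    for i, B in enumerate(design):
        counts = {}
        for j, C in enumerate(design):
            if i != j:
                u = len(B & C)
                counts[u] = counts.get(u, 0) + 1
        result.append(counts)
    return result


if __name__ == "__main__":
    D = derived_design(biplane_16(), (0, 0))
    print(len(D), {len(B) for B in D}, associate_counts(D)[0])
```

With k = 4 the lemma predicts 2k = 8 first associates and $\tfrac12 k(k-1) = 6$ second associates for every block.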


Next consider any two initial blocks of D, say $B_1$ and $B_2$. Any other block
of D is of "type 1" if it is a second associate of both $B_1$ and $B_2$, of "type 2"
if it is a second associate of one of $B_1$ and $B_2$ but a first associate of the other,
and of "type 3" if it is a first associate of both $B_1$ and $B_2$. Now we may paraphrase
Lemma 4.2 of (2), thus:

LEMMA 4.2 If two initial blocks of D are first associates, then there are
$\tfrac12(k - 1)(k - 2)$ blocks of type 1, $2(k - 1)$ blocks of type 2, and k blocks of type
3. If two initial blocks of D are second associates, then there are $\tfrac12(k - 2)(k - 3)$
blocks of type 1, $4(k - 2)$ blocks of type 2, and 4 blocks of type 3.
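
As a consistency check on the counts in Lemma 4.2 (a remark added for illustration, using only the value $b = \tfrac12(k+1)(k+2)$ from Theorem 3.2), the three types must in each case account for all $b - 2$ blocks other than the two initial ones:

```latex
\tfrac12 (k-1)(k-2) + 2(k-1) + k
  = \tfrac12 (k^{2} + 3k - 2)
  = \tfrac12 (k+1)(k+2) - 2 ,
\qquad
\tfrac12 (k-2)(k-3) + 4(k-2) + 4
  = \tfrac12 (k^{2} + 3k - 2)
  = \tfrac12 (k+1)(k+2) - 2 .
```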

Next we shall consider the structural matrices which correspond to three 
sets of blocks. In each matrix there is an unknown element, which we shall 













determine under the condition that the corresponding set of blocks forms part 
of D. The first matrix is 


ke1iii12] 

ke1l221 

(4.1) S, = R221 
k Sas 2 

k 2 

| ss 








which corresponds to a set Us, of blocks. We desire to know whether 1 or 2 or 
both are admissible values for s,s; to assume if Us forms a part of D. 

Associated with Ss is the characteristic matrix Cs, which has the elements 
Cyy = 2k and cy, = k — 2 or — 4 (j ¥ u) according as s», = 1 or 2, where j 
and u refer to the jth and uth blocks of Us. Hence, our problem is to decide 
whether k — 2 or — 4 or both are admissible values for c4; to assume. 

We obtain the determinant 


(4.2) ICe| 


4(k + 2)*(2k — cas)[(k — 2)cas + 2(k — 6)] 
4(k + 2)*fi fe, 


where f; = (2k — cas) and fe = (Rk — 2)cas + 2(k — 6). Now by Theorem 3.1 of 
(2) it is necessary that |C,| >0. For any k, 4(k + 2)*>0. Hence, either (a) f; < 0 
and $f_2 \le 0$, or (b) $f_1 \ge 0$ and $f_2 \ge 0$. Since (a) implies that $2k \le c_{45}$, which
is impossible, we must have (b), whence

(4.3) $c_{45} \ge -2 + 8/(k - 2)$,

the right-hand side of which is greater than −2 for every k > 2. Since the value −4
violates this bound, it is necessary that $c_{45} = k - 2$, and
therefore that $s_{45} = 1$. This result is contained in the following lemma.
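
The two candidate values can also be substituted directly into the factors $f_1 = 2k - c_{45}$ and $f_2 = (k-2)c_{45} + 2(k-6)$ defined above. The following short computation is added as an illustration of the argument, with k > 2 assumed:

```latex
c_{45} = -4:\quad f_{1} = 2k + 4 > 0, \qquad
f_{2} = -4(k-2) + 2(k-6) = -2(k+2) < 0 ;
\\[4pt]
c_{45} = k-2:\quad f_{1} = k + 2 > 0, \qquad
f_{2} = (k-2)^{2} + 2(k-6) = (k-4)(k+2) \ge 0 \quad (k \ge 4) .
```

So the value −4 always makes the product $f_1 f_2$ negative, while the value k − 2 keeps it non-negative once $k \ge 4$.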


Lemma 4.3 If D contains the blocks of Us, then in Se, sas = 1. 


We shall state two other lemmas, which can be proved by arguments analogous
to those used in proving Lemma 4.3. Proofs of these lemmas are given in (2).
Our Lemmas 4.4 and 4.5 correspond respectively to Lemmas 4.4 and 4.5 of (2).

Consider a set of blocks, Us", which has the structural matrix 


* 38.89 

ke1i1i1 

(4.4) SY? = sa 4 
Rk Sas 

k 


We may state a lemma about 545. 


‘In (2) it was stated that “the equations of 4.6 may be solved to determine the number of 
blocks of types 11, . . . , 32." However, since the six equations 4.6 are dependent, one cannot 
obtain the individual x;;’s but only the needed linear combinations thereof. 







LemMA 4.4 If D contains the blocks of Us, then in Ss, sag = 1. 
Finally, consider a set of blocks, Us‘, which has the structural matrix 


: 1 2 
- 
2 1 
k S45 


b 


1 1 
k 1 
(4.5) SY = k 


The related lemma follows: 


LEMMA 4.5 If D contains the blocks of Us, then in Ss, sags = 2. 

5. The sets which are determined by a block and its first associates. We
shall consider an arbitrary block $B_1$ of D and its 2k first associates. It will be
shown that these 2k + 1 blocks are uniquely separable into two sets $S_1$ and
$S_2$ of k + 1 blocks which pairwise are first associates and which have only $B_1$
in common. To show this we shall determine the structural matrix of a certain
set of k + 2 blocks.

Let any block be $B_1$. Then, by Lemma 4.1, $B_1$ has 2k first associates: $B_2$,
$B_3, \ldots, B_{2k+1}$. Let us focus our attention on any one of these, say $B_2$. By Lemma
4.2, regarding $B_1$ and $B_2$ as initial blocks, there are $\tfrac12(k - 1)(k - 2)$ blocks
of type 1, $2(k - 1)$ blocks of type 2, and k blocks of type 3 among the remaining
blocks of D. So without loss of generality we may write down two rows of the
structural matrix $S_b$ of D as follows:

$$\text{(5.1)}\qquad
\begin{array}{ccc|c|c|c|c}
k & 1 & 1 & 1 \cdots 1 & 1 \cdots 1 & 2 \cdots 2 & 2 \cdots 2 \\
1 & k & 1 & 1 \cdots 1 & 2 \cdots 2 & 2 \cdots 2 & 1 \cdots 1
\end{array}$$

where the columns correspond in left-to-right order to blocks $B_1, B_2, \ldots, B_b$.
For convenience we have partitioned $S_b$ in left-to-right order into submatrices
A, B, C, D, E, where A contains 3 columns, D contains $\tfrac12(k - 1)(k - 2)$ columns,
and B, C, E each contain k − 1 columns.

Let the blocks $B_3, \ldots, B_{k+2}$ comprise a set M, and the blocks $B_{k+3}, \ldots,$
$B_{2k+1}$ comprise a set N. We shall show that there exists one block of M which
is a first associate of two or more blocks of N. To show this we shall count the
number of ones in the submatrix C* which consists of rows 3, ..., k + 2 of C.
Since every block of N is a first associate of $B_1$, it follows from Lemma 4.2
and the observation that there are k − 1 twos in row 2 of C that there are
(k − 1)(k − 2) twos and hence 2(k − 1) ones in C*. But there are k rows in
C*, and so there is at least one row of C*, say the row corresponding to $B_3$,
which contains two or more ones.

Now consider how the third row of $S_b$ may be filled up. By considering $B_1$ and
$B_3$ as initial blocks, Lemma 4.2 applies. Also, by considering $B_2$ and $B_3$ as
initial blocks, the lemma applies. These considerations do not fully determine
the third row of $S_b$, but do exclude all possibilities except the following. If there
are j twos in row 3 of C (j = 0, ..., k − 3), then there are k − j − 1 twos in
row 3 of B, $\tfrac12(k - 1)(k - 2) - j$ twos in row 3 of D, and j twos in row 3 of E.

Now consider $S_{k+2}$, the structural matrix of the blocks which correspond
to the columns of A, the j columns of C which have 2 in row 3, and the k − j − 1
columns of E which have 1 in row 3, i.e.,


NN = 





1 | 
a | 
(5.2) Sse k | 





H 
where F and H have k in the main diagonal, but the other elements of F, G, and
H are so far unknown but will now be determined. Comparison of the structure
of $S_{k+2}$ with the structures of the matrices (4.1), (4.4), and (4.5) shows that
Lemmas 4.3, 4.4, and 4.5 apply, and hence the non-diagonal elements of F and H
are 1, and the elements of G are 2.

Corresponding to $S_{k+2}$ is the characteristic matrix $C_{k+2}$, which has 2k in the
main diagonal and k − 2 or −4 elsewhere, according as $S_{k+2}$ has 1 or 2 in the
corresponding position. Calculating $|C_{k+2}|$, which is readily done with the
help of Lemma 3.1 of (2), we obtain


(5.3) \Ces2] = 7(j — k + 2)(k — 6)(k + 2)*"’. 


Now by Theorem 3.1 of (2), $|C_{k+2}| = 0$. From (5.3), noting that $0 \le j \le k - 3$
and that $k + 2 > b - v = k + 1$, it follows that $|C_{k+2}| = 0$ when and only
when j = 0 or k = 6. The case k = 6 was disposed of in (2).

Let j = 0. Then by Lemma 4.4, the blocks of N pairwise are first associates,
and the blocks of M other than $B_3$ pairwise are first associates. Thus $B_1$ and its
2k first associates uniquely determine two sets of k + 1 blocks which pairwise
are first associates. The sets are $S_1(B_1, B_2, B_4, \ldots, B_{k+2})$ and $S_2(B_1, B_3, B_{k+3},$
$\ldots, B_{2k+1})$. We summarize in the following lemma.


LEMMA 5.1 Any block $B_1$ and its 2k first associates uniquely comprise two
sets $S_1$ and $S_2$ of k + 1 blocks which pairwise are first associates and have only
one common element, $B_1$.


6. Conclusion. We have shown that there exist sets $S_j$ which satisfy (1)
and (4) of Theorem 2.1. To show that there are k + 2 sets $S_j$ in all, we observe
that since every block occurs in precisely two sets $S_j$, and each set contains
k + 1 blocks, the number n of sets satisfies 2b = n(k + 1), whence n = k + 2.
Further, because any one of the $b = \tfrac12(k + 1)(k + 2)$ blocks is the unique
block in common to two unordered sets $S_i$, $S_j$, each of the unordered pairs of
sets will have a different block in common and thus any two sets will have
one and only one block in common. Thus, the sets $S_j$ satisfy (3) of Theorem 2.1.
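
These counting statements can be watched at work for k = 4. The sketch below is an illustration only and assumes, as a concrete model not taken from the paper, the symmetric (16, 6, 2) design on a 4 x 4 grid. It deletes one block, reads off the sets $S_x$ from the deleted varieties x, and checks conditions (1) to (4) of Theorem 2.1 with $\lambda = 2$. The variable names are hypothetical.

```python
from itertools import combinations, product

k = 4
cells = list(product(range(4), repeat=2))
# Assumed example: block of a cell = the other cells in the same row or column.
symmetric = {c: frozenset(p for p in cells
                          if p != c and (p[0] == c[0] or p[1] == c[1]))
             for c in cells}

B0 = symmetric[(0, 0)]
# Derived design: drop B0 and remove its varieties from every other block.
derived = {c: B - B0 for c, B in symmetric.items() if c != (0, 0)}

# S_x: derived blocks whose parent block contained the deleted variety x.
sets = {x: frozenset(c for c in derived if x in symmetric[c]) for x in B0}

# (4) number of sets containing each block; (3) blocks common to two sets.
membership = {c: sum(c in S for S in sets.values()) for c in derived}
common = {len(S & T) for S, T in combinations(sets.values(), 2)}

# (2) how often each variety of D occurs among the blocks of one set.
varieties = set().union(*derived.values())
replication = {sum(v in derived[c] for c in S)
               for S in sets.values() for v in varieties}

if __name__ == "__main__":
    print(len(sets), {len(S) for S in sets.values()},
          set(membership.values()), common, replication)
```

Here b = 15 and the relation 2b = n(k + 1) gives n = 6 = k + 2 sets, as in the text.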

To prove (2) of Theorem 2.1, let $m_i$ denote the number of treatments which
are replicated i times in an $S_j$. Then the following relations are necessary:

$$\text{(6.1)}\qquad
\sum_{i=0}^{k+1} m_i = v = \tfrac12 k(k+1), \qquad
\sum_{i=0}^{k+1} i\,m_i = k(k+1), \qquad
\sum_{i=0}^{k+1} i(i-1)\,m_i = k(k+1),$$

where the last relation arises because every two blocks of $S_j$ are first associates.
Now consider the function

$$Q(i) = \sum_{i=0}^{k+1} m_i (i - 2)^2.$$

By (6.1), Q(i) = 0, which implies that i = 2, since $m_i \ge 0$ $(i = 0, 1, \ldots, k + 1)$ and

$$\sum_{i=0}^{k+1} m_i > 0.$$

This completes the proof of all properties of the sets S, required for Theorem 
2.1 and incidentally their uniqueness and so in turn the uniqueness of the 


embedding. 
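
The step from (6.1) to $Q = 0$ rests on the identity $(i-2)^2 = i(i-1) - 3i + 4$. The following is a minimal sketch that checks this arithmetic for a range of $k$; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction


def q_from_moments(k):
    """Q = sum m_i (i-2)^2, computed only from the three sums in (6.1)."""
    s0 = Fraction(k * (k + 1), 2)  # sum of m_i
    s1 = k * (k + 1)               # sum of i * m_i
    s2 = k * (k + 1)               # sum of i * (i - 1) * m_i
    # (i - 2)^2 = i(i - 1) - 3i + 4
    return s2 - 3 * s1 + 4 * s0


def relations_for_doubly_replicated(k):
    """The three sums of (6.1) when m_2 = k(k+1)/2 and every other m_i is 0."""
    m = {i: 0 for i in range(k + 2)}
    m[2] = k * (k + 1) // 2
    s0 = sum(m.values())
    s1 = sum(i * m[i] for i in m)
    s2 = sum(i * (i - 1) * m[i] for i in m)
    return s0, s1, s2
```

The second function confirms that the configuration in which every treatment is replicated exactly twice is consistent with all three relations.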


REFERENCES 


1. K. N. Bhattacharya, A new balanced incomplete block design, Science and Culture, 9 (1944), 
508. 

2. W. S. Connor, Jr., On the structure of balanced incomplete block designs, Ann. Math. Stat., 
23 (1952), 57-71. 

3. H. K. Nandi, Enumeration of non-isomorphic solutions of balanced incomplete block designs, 
Sankhya, 7 (1946), 305-312. 


Ohio State University 
National Bureau of Standards, Washington, D.C. 











COMPLETENESS OF ORDER STATISTICS 
D. A. S. FRASER 


1. Introduction. Under the non-parametric assumption that a set of 
observations is a sample from an absolutely continuous distribution, the order 
statistics are known to form a complete sufficient statistic. It is proved in this 
note that it suffices to have the class of uniform distributions over finite numbers 
of intervals or the class of uniform distributions over sets of a ring which is a
basis for the σ-algebra of Borel sets. This result is derived as a particular case
of that of several samples from more general distributions. 


2. Formulation and statement of results. Let $(\mathfrak{X}, \mathfrak{A})$ stand for a measurable
space: $\mathfrak{X}$ is an arbitrary space and $\mathfrak{A}$ is a σ-algebra of subsets $A \subset \mathfrak{X}$. A class
$\mathfrak{B} = \{B\}$ of subsets of $\mathfrak{X}$ will be called a basis of $\mathfrak{A}$ and written $\mathfrak{B} = \mathfrak{B}(\mathfrak{A})$ if
$\mathfrak{B}$ is a ring and if $\mathfrak{A}$ is the σ-ring generated by $\mathfrak{B}$.

Distributions over the space $\mathfrak{X}$ will be given in terms of a finite measure $\mu$
over $\mathfrak{A}$ by means of a density function $f_\eta(x)$; that is,

$$P_\eta(A) = \int_A f_\eta(x)\, d\mu(x),$$

and we are thus restricted to distributions which are absolutely continuous
with respect to $\mu(x)$.

To describe a sample of $n$ from a distribution $P_\eta$, we envisage the product
space $\mathfrak{X}^n$ with the σ-algebra generated by $\mathfrak{A}^n$ and the power product measure
$\mu^n$. Then, for $C$ a measurable subset of $\mathfrak{X}^n$,

$$P_\eta^n(C) = \int_C \prod_{1}^{n} f_\eta(x_i) \prod_{1}^{n} d\mu(x_i) \tag{2.1}$$

and the distributions are given by $\{\prod f_\eta(x_i)\}$.

A statistic based on a sample of $n$ is a measurable function $g(x_1, \dots, x_n)$.
A statistic of interest will be $O(x_1, \dots, x_n) = \{x_1, \dots, x_n\}$, that is, the set of
x's without regard for the order in which the x's occur in the function $O(x_1, \dots, x_n)$.
Thus $O(x_1, \dots, x_n)$ is invariant under the group of permutations of the
x's and is in fact the maximal invariant function under such transformations.
Any statistic $g(x_1, \dots, x_n)$ which can be written as a function of $O(x_1, \dots, x_n)$
is a symmetric function of the x's, and conversely.
We define a complete class of measures. $\{\nu_\eta(C) \mid \eta \in \Omega\}$ is a complete class if

$$\int g(y)\, d\nu_\eta(y) = 0 \quad \text{for all } \eta \in \Omega \tag{2.2}$$


Received May 4, 1953. 







implies that $g(y) = 0$ almost everywhere with respect to the measures $\nu_\eta$. In this
note we prove that the class of distributions of $O(x_1, \dots, x_n)$ induced by the
distributions $\{P_\eta^n \mid \eta \in \Omega\}$ is complete subject to conditions on $\Omega$.





3. Derivation of results. 


THEOREM. The distributions of $O(x_1, \dots, x_n)$ induced by the distributions
$\{P_\eta^n \mid \eta \in \Omega\}$ over $\mathfrak{X}^n$ form a complete class if $\mu(x)$ is non-atomic and if $\{f_\eta \mid \eta \in \Omega\}$
consists of uniform distributions over the sets $B$ of a basis $\mathfrak{B}(\mathfrak{A})$.


Proof. The density function $f_\eta(x)$ of a uniform distribution will have the
simple form

$$f_\eta(x) = c(\eta)\, \phi_\eta(x), \tag{3.1}$$

where $\phi_\eta(x)$ is the characteristic function of a set $B_\eta \in \mathfrak{B}(\mathfrak{A})$ and $c(\eta)$ is a
normalizing constant such that

$$1 = c(\eta) \int \phi_\eta(x)\, d\mu(x) = c(\eta)\, \mu(B_\eta).$$


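To show that the distributions of $O(x_1, \dots, x_n)$ form a complete class, we show
that any measurable function of $O(x_1, \dots, x_n)$ satisfying (2.2) is zero almost
everywhere $\mu^n$. However, such a function is necessarily a symmetric function
$h(x_1, \dots, x_n)$ of the x's and hence (2.2) gives

$$\int_{\mathfrak{X}^n} h(x_1, \dots, x_n) \prod_{1}^{n} f_\eta(x_i) \prod_{1}^{n} d\mu(x_i) = 0. \tag{3.2}$$

Since $c(\eta) \ne 0$, we have

$$\int_{B_\eta^n} h(x_1, \dots, x_n) \prod_{1}^{n} d\mu(x_i) = 0. \tag{3.3}$$

Let $B_1, \dots, B_n$ be any $n$ disjoint sets belonging to $\mathfrak{B}(\mathfrak{A})$. Since $\mathfrak{B}(\mathfrak{A})$ is a
ring, any sum of B's will belong to $\mathfrak{B}(\mathfrak{A})$ and therefore

$$\int_{(B_1 \cup \dots \cup B_r)^n} h(x_1, \dots, x_n) \prod_{1}^{n} d\mu(x_i) = 0 \qquad (r = 1, \dots, n). \tag{3.4}$$

If we define

$$I(j_1, \dots, j_n) = \int_{B_{j_1} \times \dots \times B_{j_n}} h(x_1, \dots, x_n) \prod_{1}^{n} d\mu(x_i),$$

then the symmetry of $h$ implies that $I$ is symmetric. From (3.4) with $r = 1$
we obtain

$$I(1, \dots, 1) = 0, \qquad I(2, \dots, 2) = 0, \qquad \dots,$$

and with $r = 2$,

$$\sum_{j_1=1}^{2} \cdots \sum_{j_n=1}^{2} I(j_1, \dots, j_n) = 0.$$

Then by subtraction,

$$\sum_{\alpha(1,2)} I(j_1, \dots, j_n) = 0,$$

where $\alpha(1,2)$ is all $n$-tuples $(j_1, \dots, j_n)$ containing both and only 1's and 2's.
Proceeding inductively, we obtain finally

$$\sum_{\alpha(1,\dots,n)} I(j_1, \dots, j_n) = 0,$$

where $\alpha(1, \dots, n)$ is the set of permutations of $(1, \dots, n)$. From the symmetry
of $I$ it follows that

$$I(1, \dots, n) = 0,$$

that is,

$$\int_{B_1 \times \dots \times B_n} h(x_1, \dots, x_n) \prod_{1}^{n} d\mu(x_i) = 0. \tag{3.4}$$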


Since $\mu$ is non-atomic it follows from Halmos (1) that $\mathfrak{X}$ can be divided into
a finite number of disjoint sets $\{S_i^\epsilon\}$ each of which has $\mu$-measure less than or
equal to $\epsilon$. Consequently the diagonal space of $\mathfrak{X}^n$,

$$D(\mathfrak{X}^n) = \{(x_1, \dots, x_n) \mid \exists\, i, j:\ x_i = x_j,\ i \ne j\},$$

can for any positive $\delta$ be enclosed in a set $E_\delta(\mathfrak{X}^n)$ having $\mu^n$-measure less than
or equal to $\delta$. For example,

$$E_\delta(\mathfrak{X}^n) = \bigcup \{S_i^\epsilon \times S_i^\epsilon \times \mathfrak{X}^{n-2}\},$$

where $\epsilon$ is chosen sufficiently small $(= \epsilon(\delta))$ and where the union is taken
over $i$ and over the image sets under permutations $P$ of the coordinates.
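Since the sets $\mathfrak{B}(\mathfrak{A})$ form a basis for $\mathfrak{A}$, the sets $\{B_1 \times \dots \times B_n\}$ with
disjoint B's belonging to $\mathfrak{B}(\mathfrak{A})$ form a basis for the σ-algebra

$$\mathfrak{A}_\delta^n = \{C \cap (\mathfrak{X}^n - E_\delta(\mathfrak{X}^n)) \mid C \in \mathfrak{A}^n\}.$$

By extending the signed measure (3.4), we obtain

$$\int_{C^*} h(x_1, \dots, x_n) \prod_{1}^{n} d\mu(x_i) = 0,$$

where $C^* \in \mathfrak{A}_\delta^n$. The Radon-Nikodym theorem then establishes that $h = 0$
almost everywhere $\mu^n$ in $\mathfrak{X}^n - E_\delta(\mathfrak{X}^n)$. Since $\delta$ can be arbitrarily small, the
theorem follows.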


COROLLARY. The distributions of $O(x_1, \dots, x_n), \dots, O(y_1, \dots, y_m)$ induced
by the distributions $\{P_\eta^n \times \dots \times P_\xi^m \mid (\eta, \dots, \xi) \in \Omega \times \dots \times Z\}$ over
$\mathfrak{X}^n \times \dots \times \mathfrak{Y}^m$ form a complete class if $P_\eta, \dots, P_\xi$ satisfy the conditions in
the Theorem.

The proof is a straight extension of that for the Theorem.


Example. Consider the space $\mathfrak{X} = \{(z, y)\} = \,]-\infty, 0[\, \times \,]0, \infty[$ and take
as a basis for the σ-algebra of Borel sets the class of sets consisting of finite
unions of rectangles. If we take as probability measures the uniform distributions
over sets of the basis above, then for a sample of $n$ the theorem gives the
completeness of the probability measures of the statistic

$$\{(z_1, y_1), \dots, (z_n, y_n)\}.$$


REFERENCE 


1. P. Halmos, The range of a vector measure, Bull. Amer. Math. Soc., 54 (1948), 416-421 


University of Toronto 











NON-PARAMETRIC THEORY: SCALE AND LOCATION 
PARAMETERS 


D. A. S. FRASER 


1. Summary. In §2 a result in measure theory is obtained. The remainder 
of this paper, §3 to §11, contains results in the branch of statistics called non- 
parametric theory; these results in part are based on the measure result of §2. 

The measure result concerns a class of probability distributions—those 
distributions having a probability density function on the real line and for 
which a fraction p of the probability is on the negative axis and a fraction 
$q = 1 - p$ is on the positive axis. Corresponding to a sample of $n$ the functional
form is obtained for a statistic having expectation zero for all distributions in 
the class; such a statistic is referred to as an unbiased estimate of zero. 

In §3 a reasonable definition of location parameter for the continuous dis- 
tributions on the real line requires it to be the p-percentile, that is, the point 
having a total probability p to the left of it. In §4 confidence regions for this 
parameter are characterized, confidence bounds are shown to be based on 
order statistics, and confidence regions with certain optimum properties are 
obtained. In §5 several problems in hypothesis testing on the location parameter 
are considered and most powerful and most powerful unbiased tests are obtained.
A bivariate analogue of one of these problems is considered in §6.

In §7 reasonable definitions are considered for scale and location-scale para- 
meters for continuous distributions on the real line. For the scale parameter a 
result of negative nature is obtained in §8: that similar tests do not exist for
the hypothesis that specifies a value for the scale parameter. 

In §9 a formulation is given for non-parametric tolerance regions. A particular 
type of these, distribution-free upper tolerance bounds, was treated by Robbins 
in 1944. His condition, obtained under an assumption of continuity, is shown 
to be necessary but not sufficient in the general case; a bound chooses the order 
statistics with fixed but arbitrary probabilities. 

In §10 some results in estimation theory obtained by Lehmann and Scheffé
are extended to permit wider application in non-parametric theory. Two
examples of estimation in non-parametric theory are considered in §11.


2. A measure problem with applications in statistics. Some results will
be obtained for probability distributions over $R^n$. First we define some classes
of measures on the real line $R^1$. Let $\mathfrak{F}$ be the class of probability measures on
$R^1$, $\mathfrak{F}_0$ be the subclass of distributions absolutely continuous with respect to
Lebesgue measure, $\mathfrak{F}_0(p)$ the subclass of $\mathfrak{F}_0$ whose elements have $F(0) = p$,
$\mathfrak{F}_1$ be the class of discrete distributions with probability at a finite number

Received November 5, 1952; in revised form May 5, 1953. 




of points, $\mathfrak{F}_2$ be the class of uniform distributions over a finite number of
intervals, and $\mathfrak{F}_3$ be the class having a probability density of the form

$$c(\theta_1, \dots, \theta_n) \exp\Big\{-x^{2n} - \sum_{1}^{n} \theta_i x^i\Big\}.$$


From these distributions we derive measures over $R^n$, the power product
measure induced by a measure or distribution on $R^1$. Letting $F(x_1, \dots, x_n)$ be
the distribution function obtained from $F(x)$, then

$$F(x_1, \dots, x_n) = \prod_{1}^{n} F(x_i).$$

We designate by $\mathfrak{F}_i^n$ the class of measures over $R^n$ which is obtained from $\mathfrak{F}_i$:

$$\mathfrak{F}_i^n = \Big\{\prod_{1}^{n} F(x_j) \;\Big|\; F(x) \in \mathfrak{F}_i\Big\}.$$


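To give an outline of some previous results concerning the classes $\mathfrak{F}_i^n$, we
need the concept of a complete class of measures. Let $\mu_\theta(A)$ be a probability
measure over a space $\mathfrak{X}$ with a σ-algebra of subsets $\mathfrak{A}$; that is, $\mu_\theta$ satisfies

(1) $\mu_\theta(A) \ge 0$, $A \in \mathfrak{A}$.

(2) $\mu_\theta(\mathfrak{X}) = 1$.

(3) If $A_i \in \mathfrak{A}$ and $A_i \cap A_j = \emptyset$ $(i \ne j)$, then

$$\mu_\theta\Big(\bigcup_i A_i\Big) = \sum_i \mu_\theta(A_i).$$

The class of measures $\{\mu_\theta(A) \mid \theta \in \Omega\}$ is complete if $\int f(x)\, d\mu_\theta(x) = 0$ for all $\theta$
implies $f(x) = 0$ almost everywhere $\{\mu_\theta(A)\}$.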
In non-parametric theory applied to distributions over $R^n$, the order statistics
play an important role; we define a statistic $T(x_1, \dots, x_n) = (x_{(1)}, \dots, x_{(n)})$,
the "order statistics," where $x_{(1)}, \dots, x_{(n)}$ are the numbers $x_1, \dots, x_n$ arranged
in order of magnitude. Obviously any function of $x_1, \dots, x_n$ which can be
expressed as a function of $T(x_1, \dots, x_n)$ is a symmetric function. Corresponding
to any distribution over $R^n$ the statistic $T(x_1, \dots, x_n)$ will have an induced
probability distribution.

In 1946, Halmos (1) showed that the distributions of $T(x_1, \dots, x_n)$
corresponding to $\mathfrak{F}_1^n$ were complete. Lehmann in (2) proved a similar result for
$\mathfrak{F}_3^n$. In (3) the author showed the same for $\mathfrak{F}_2^n$. The distributions of
$T(x_1, \dots, x_n)$ corresponding to $\mathfrak{F}_0(p)$ are, however, not complete (unless $p = 0, 1$);
we prove here some results for $\mathfrak{F}_0(p)$ which are the natural extensions of the
concept of completeness.
A statistic $\phi(x_1, \dots, x_n)$ is an unbiased estimate of a real valued function $g(F)$
of the distributions $F(x)$ of a class $\mathfrak{G}$ if

$$\int_{R^n} \phi(x_1, \dots, x_n) \prod_{1}^{n} dF(x_i) = g(F)$$

for all distributions $F \in \mathfrak{G}$.


Using this definition we have the

THEOREM 2.1. For the distributions $\mathfrak{F}_0(p)$ a necessary and sufficient condition
that a function of $T(x_1, \dots, x_n)$ be an unbiased estimate of zero is that it have the
form

$$\phi(x_1, \dots, x_n) = \sum_{i=1}^{n} a(x_i)\, \psi(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \tag{2.1}$$

almost everywhere, where $\psi(x_1, \dots, x_{n-1})$ is an arbitrary bounded measurable
symmetric function, and

$$a(x) = \begin{cases} -q = -(1 - p) & \text{if } x < 0, \\ +p & \text{if } x > 0. \end{cases} \tag{2.2}$$

Note: The theorem gives the form of a function $\phi(x_1, \dots, x_n)$ satisfying

$$\int_{R^n} \phi(x_1, \dots, x_n) \prod_{1}^{n} dF(x_i) = 0 \tag{2.3}$$

for all $F(x) \in \mathfrak{F}_0(p)$. If we relax our requirement of absolute continuity and
consider the class $\mathfrak{F}(p)$ of all distributions on $R^1$ having $F(0) = p = F(0 - 0)$,
then the only change is that (2.1) is required to hold everywhere. The proof is
obtained by trivial changes in the lemmas.
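
The sufficiency half of Theorem 2.1 can be checked by exact arithmetic on a small case. The sketch below evaluates the expectation of a statistic of the form (2.1) under a finite discrete distribution with mass $p$ on negative points and $q$ on positive points (a member of the relaxed class $\mathfrak{F}(p)$ of the Note, used here only because it permits exact enumeration). The distribution, the choice of $\psi$, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def a(x, p):
    """The function a(x) of (2.2): -q for negative x, +p for positive x."""
    return -(1 - p) if x < 0 else p


def phi(xs, p, psi):
    """A statistic of the form (2.1), built from a symmetric function psi."""
    return sum(a(x, p) * psi(xs[:i] + xs[i + 1:]) for i, x in enumerate(xs))


def expectation(dist, n, p, psi):
    """Exact E[phi] for n independent draws from a finite distribution {x: prob}."""
    total = Fraction(0)
    for xs in product(dist, repeat=n):
        weight = Fraction(1)
        for x in xs:
            weight *= dist[x]
        total += weight * phi(xs, p, psi)
    return total


def psi_example(ys):
    """An arbitrary bounded symmetric function on the chosen support."""
    return max(ys) ** 2 + sum(ys)


# Mass 1/3 on the negative axis, 2/3 on the positive axis, no atom at 0.
P = Fraction(1, 3)
DIST = {-2: Fraction(1, 6), -1: Fraction(1, 6), 1: Fraction(1, 2), 4: Fraction(1, 6)}
```

With the correct $p$ the expectation vanishes exactly, because $E\,a(X) = -qp + pq = 0$ and each $a(x_i)$ is independent of the remaining arguments of $\psi$; with a mismatched $p$ it does not.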


Proof. We first note that $\phi(x_1, \dots, x_n)$ is bounded almost everywhere.
Otherwise there would exist a sequence of numbers $c_1, c_2, \dots$ approaching $\infty$
and sets $S_1, S_2, \dots$,

$$S_i = \{(x_1, \dots, x_n) \mid |\phi(x_1, \dots, x_n)| > c_i\},$$

such that each has positive Lebesgue measure. For any such set it is possible
to obtain a rectangular set which is more than $\tfrac{1}{2}$, say, filled (Lebesgue) with
points of $S_i$. On the basis of the sequence of rectangular sets it is possible to
define a density function for which $E\{|\phi|\}$ would not exist. The Theorem
assumes that all expectations exist equal to zero; hence a contradiction.

The proof proper then obtains from the following three lemmas.


LEMMA 2.1. If $\phi(x_1, \dots, x_n)$ is a symmetric unbiased estimate of zero for
$\mathfrak{F}_0(p)$, then almost everywhere $(z_1, \dots, z_n, y_1, \dots, y_n) \in \,]-\infty, 0[^n \times \,]0, \infty[^n$

$$\sum_{r=0}^{n} \sum_{x = z^{n-r} y^r} p^{n-r} q^r\, \phi(x_1, \dots, x_n) = 0, \tag{2.4}$$

where the summation with subscript $x = z^{n-r} y^r$ is taken over the $\binom{n}{r}$ terms obtained
by replacing $r$ x's with y's and $n - r$ x's with z's.


Proof. In (3) a complete sufficient statistic was given for a sample of $n$
from an arbitrary bivariate distribution over $]-\infty, 0[\, \times \,]0, \infty[$. Letting $(z_1, y_1),
\dots, (z_n, y_n)$ be the sample elements, the statistic is $\{(z_1, y_1), \dots, (z_n, y_n)\}$.
Then if $\phi(z_1, y_1, \dots, z_n, y_n)$ is an unbiased estimate of zero for these
distributions the symmetrized form of $\phi$,

$$\bar{\phi}(z_1, y_1, \dots, z_n, y_n) = \frac{1}{n!} \sum \phi(z_{i_1}, y_{i_1}, \dots, z_{i_n}, y_{i_n}),$$

where the summation is over all permutations $(i_1, \dots, i_n)$ of $(1, \dots, n)$, will
be zero almost everywhere.

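Let $f(z, y)$ be an arbitrary probability density function over $]-\infty, 0[\, \times \,]0, \infty[$,
and $f_-(z)$ and $f_+(y)$ be respectively the $z$ and $y$ marginal densities. If, outside the
present range of definition of $f_-(z)$, $f_+(y)$ on the real line, we give them the
value zero, then $g(x) = p f_-(x) + q f_+(x)$ is the density of a distribution
belonging to $\mathfrak{F}_0(p)$.

Now if $\phi(x_1, \dots, x_n)$ is a symmetric unbiased estimate of zero for $\mathfrak{F}_0(p)$,
then for $g(x)$ defined above we have

$$\begin{aligned}
0 &= \int_{R^n} \phi(x_1, \dots, x_n) \prod_{1}^{n} g(x_i) \prod_{1}^{n} dx_i \\
&= \sum_{r=0}^{n} \binom{n}{r} p^{n-r} q^r \int \phi(z_1, \dots, z_{n-r}, y_{n-r+1}, \dots, y_n) \prod_{1}^{n-r} f_-(z_i) \prod_{n-r+1}^{n} f_+(y_i) \prod dz_i \prod dy_i \\
&= \sum_{r=0}^{n} \binom{n}{r} p^{n-r} q^r \int \phi(z_1, \dots, z_{n-r}, y_{n-r+1}, \dots, y_n) \prod_{1}^{n} f(z_i, y_i) \prod_{1}^{n} dz_i \prod_{1}^{n} dy_i \\
&= \int \Big[\sum_{r=0}^{n} p^{n-r} q^r \sum_{x = z^{n-r} y^r} \phi(x_1, \dots, x_n)\Big] \prod_{1}^{n} f(z_i, y_i) \prod_{1}^{n} dz_i \prod_{1}^{n} dy_i.
\end{aligned}$$

But from (3) we have that the symmetrized form of the integrand is zero
almost everywhere. This completes the proof.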

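LEMMA 2.2. If for all $(z_1, \dots, z_n, y_1, \dots, y_n) \in \,]-\infty, 0[^n \times \,]0, \infty[^n$,

$$\sum_{x = z, y} \phi(x_1, \dots, x_n) = 0, \tag{2.5}$$

where the summation is over all terms obtained by replacing each $x$ by either $z$
or $y$, then

$$\phi(x_1, \dots, x_n) = \sum_{i=1}^{n} a'(x_i)\, \psi_i(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \tag{2.6}$$

where

$$a'(x) = \begin{cases} +1 & \text{if } x > 0, \\ -1 & \text{if } x < 0. \end{cases}$$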

Proof. The proof is obtained by induction. For $n = 1$ the lemma is obvious;
assume it holds for $n - 1$. From (2.5) we have, with the sums taken over $x_2, \dots, x_n$,

$$\sum_{x = z, y} \phi(z_1, x_2, \dots, x_n) = \sum_{x = z, y} \phi(z_1', x_2, \dots, x_n) = -\sum_{x = z, y} \phi(y_1, x_2, \dots, x_n) = -\sum_{x = z, y} \phi(y_1', x_2, \dots, x_n).$$

Then

$$\sum_{x = z, y} [\phi(z_1, x_2, \dots, x_n) + \phi(y_1, x_2, \dots, x_n)] = 0,$$

$$\sum_{x = z, y} [\phi(y_1, x_2, \dots, x_n) - \phi(y_1', x_2, \dots, x_n)] = 0.$$

But by the inductive argument,

$$\phi(z_1, x_2, \dots, x_n) + \phi(y_1, x_2, \dots, x_n) = a'(x_2)\, \psi_2(z_1, x_3, \dots, x_n) + a'(x_3)\, \psi_3(z_1, x_2, x_4, \dots, x_n) + \dots,$$

$$\phi(y_1, x_2, \dots, x_n) - \phi(y_1', x_2, \dots, x_n) = a'(x_2)\, \psi_2(y_1, x_3, \dots, x_n) + a'(x_3)\, \psi_3(y_1, x_2, x_4, \dots, x_n) + \dots.$$

These two equations together imply (2.6).
If in addition $\phi(x_1, \dots, x_n)$ is symmetric, then the $\psi$ functions can be the
same:

$$\psi_i(x_1, \dots, x_{n-1}) = \psi(x_1, \dots, x_{n-1}).$$

Also the symmetrized $\psi$ function is uniquely determined. For suppose we have
two determinations:

$$\phi(x_1, \dots, x_n) = \sum_{i=1}^{n} a'(x_i)\, \psi(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = \sum_{i=1}^{n} a'(x_i)\, \psi^*(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n).$$

By subtraction,

$$0 = \sum_{i=1}^{n} a'(x_i)\, \xi(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n). \tag{2.7}$$

We now prove by induction that $\xi$ is identically zero. For $n = 1$, the statement
is obvious; assume it holds for $n - 1$. We have

$$-a'(x_1)\, \xi(x_2, \dots, x_n) = a'(x_2)\, \xi(x_1, x_3, \dots, x_n) + \dots + a'(x_n)\, \xi(x_1, x_2, \dots, x_{n-1}),$$

and the left hand side is independent of $x_1$ ($x_1 > 0$ or $x_1 < 0$). The assumption
for $n - 1$ implies that the right hand side term by term is independent of
$x_1$ ($x_1 > 0$ or $x_1 < 0$). Also when $x_1$ changes sign, so does the left side and hence
the right side term by term. Thus we have

$$\xi(x_1, \dots, x_{n-1}) = a'(x_1)\, \xi^*(x_2, \dots, x_{n-1})$$

and from symmetry

$$\xi(x_1, \dots, x_{n-1}) = \xi^* \prod_{1}^{n-1} a'(x_i).$$

Substitution in (2.7) then gives $\xi^* = 0$, $\xi(x_1, \dots, x_{n-1}) = 0$, and therefore the
uniqueness of $\psi(x_1, \dots, x_{n-1})$ in the symmetric case.
LEMMA 2.3. If $\phi(x_1, \dots, x_n)$ is symmetric and satisfies (2.4), then
$\phi(x_1, \dots, x_n)$ has the form (2.1).

Proof. Letting $i(x_1, \dots, x_n)$ be the number of positive x's and defining
$\phi'(x_1, \dots, x_n)$ by

$$\phi'(x_1, \dots, x_n) = p^{\,n - i(x_1, \dots, x_n)}\, q^{\,i(x_1, \dots, x_n)}\, \phi(x_1, \dots, x_n),$$

then (2.4) becomes

$$\sum_{x = z, y} \phi'(x_1, \dots, x_n) = 0. \tag{2.8}$$

This last relation need only hold almost everywhere. However, if we replace
$\phi'(x_1, \dots, x_n)$ by its average over a rectangular cube with sides $\delta_1, \dots, \delta_n$ centred
on $(x_1, \dots, x_n)$, then (2.8) holds everywhere (points distant $n^{1/2} \max \delta_i$ from
the diagonal planes excepted). Lemma 2.2 then gives

$$\phi'(x_1, \dots, x_n) = a'(x_1)\, \psi_1(x_2, \dots, x_n) + \dots + a'(x_n)\, \psi_n(x_1, \dots, x_{n-1}). \tag{2.9}$$

Since the $\psi$'s are just linear combinations of values of $\phi'$'s, then by the Radon-
Nikodym theorem the above form for $\phi'(x_1, \dots, x_n)$ holds almost everywhere
as well as on the average as obtained. From the symmetry of $\phi$ we may have
$\psi_i = \psi$ independent of $i$.

Substituting for $\phi'$ in terms of $\phi$, using $a(x)$, and appropriately defining $\psi$,
(2.9) becomes (2.1).


3. Definition of a location parameter. In sampling from a probability
distribution over the real line the statistician, assuming a non-parametric
hypothesis, will usually envisage a class of distributions as general as $\mathfrak{F}_0$: $\mathfrak{F}_0$
is the class of distributions having a density function on the real line. For this
class of distributions we consider what real parameters can legitimately be
called location parameters.

By a real valued parameter for a class of distributions $\mathfrak{G}$ is meant a real
number for each $F \in \mathfrak{G}$; that is, a real valued function $\xi(F)$ defined over $\mathfrak{G}$.
A reasonable requirement for $\xi(F)$ to be called a location parameter might be
given by the

DEFINITION. $\xi(F)$ is a location parameter if, for any $F, G \in \mathfrak{G}$ for which
$F(x) = G(f(x))$ where $f(x)$ is monotone nondecreasing,

$$\xi(G) = f(\xi(F)).$$

The meaning of this definition is more apparent in terms of random variables.
Let $X$ be a random variable with distribution $F(x)$; then $Y = f(X)$ has
distribution $G(y)$. The condition is that the value of the location parameter also
be transformed by the function $f(x)$. Many location parameters in parametric
problems satisfy this condition.

As an immediate consequence of this definition, we obtain that $\xi(F)$ is a
percentile of the distribution and that there is a number $p$ such that $F(\xi(F)) = p$
(with the obvious modification if $F$ is not continuous). For if we assume that
$\mathfrak{G}$ contains a continuous distribution $G$, then for any $F \in \mathfrak{G}$ there is a
monotone non-decreasing function $f_F(x)$ such that $G(x) = F(f_F(x))$. The definition
gives $\xi(F) = f_F(\xi(G))$. This uniquely determines $\xi(F)$ and we have

$$F(\xi(F)) = F(f_F(\xi(G))) = G(\xi(G)) = p,$$

where $p$ is the constant $G(\xi(G))$.
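
The defining property can be illustrated with closed-form distributions. In the sketch below, $X$ is taken to be exponential with unit rate and $f(x) = x^2$ (monotone on the support $x \ge 0$), so that $Y = f(X)$ has distribution $G(y) = 1 - e^{-\sqrt{y}}$; these choices and the function names are assumptions made purely for illustration. The percentile transforms with $f$, whereas the mean does not.

```python
import math


def f(x):
    """A monotone non-decreasing transformation on the support x >= 0."""
    return x * x


def quantile_F(p):
    """p-percentile of X ~ Exponential(1): F(x) = 1 - exp(-x)."""
    return -math.log(1.0 - p)


def quantile_G(p):
    """p-percentile of Y = X^2: G(y) = 1 - exp(-sqrt(y))."""
    return math.log(1.0 - p) ** 2


MEAN_X = 1.0  # E[X] for the unit exponential
MEAN_Y = 2.0  # E[X^2] for the unit exponential
```

Here `f(quantile_F(p))` agrees with `quantile_G(p)` for every `p`, while `f(MEAN_X)` equals 1 and not `MEAN_Y`, so the mean fails the requirement of the Definition for this pair of distributions.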













Restricting ourselves to percentiles as the reasonable non-parametric location 
parameter, we define the p-percentile by 


$$\xi_p(F) = F^{-1}(p).$$


This definition is not always unique. For if there is an interval over which 
F(x) is constant at the value p, then the definition gives all the points of that 
interval. However this is not a real drawback. 


4. Confidence regions for the location parameter. From the results
of Theorem 2.1, it is possible to characterize similar $\beta$ confidence regions for
the location parameter $\xi_p(F)$.

Let $S(x_1, \dots, x_n; R)$ be a set on the real line for each $(x_1, \dots, x_n; R)$ where
$0 \le R \le 1$. Thus $S(x_1, \dots, x_n; R)$ is a mapping from $R^n \times [0,1]$ into the space
of subsets of $R^1$. We require that the characteristic function,

$$\Phi_\theta(x_1, \dots, x_n; R) = \begin{cases} 1 & \text{if } \theta \in S(x_1, \dots, x_n; R), \\ 0 & \text{if } \theta \notin S(x_1, \dots, x_n; R), \end{cases}$$

should be measurable in $(x_1, \dots, x_n; R)$ for each $\theta$. We say that $S(x_1, \dots, x_n; R)$
is a $\beta$ randomized confidence region for the parameter $\xi_p(F)$ if

$$\Pr\nolimits_F(\xi_p(F) \in S(X_1, \dots, X_n; R)) \ge \beta$$

for all $F \in \mathfrak{F}_0$ where $(X_1, \dots, X_n)$ is a sample of $n$ from the distribution $F$ and $R$
is assumed to be uniformly distributed on $[0,1]$. Similarly $S(x_1, \dots, x_n; R)$ is a
similar $\beta$ randomized confidence region for $\xi_p(F)$ if the inequality with $\beta$ is replaced
by equality.

We investigate similar $\beta$ confidence regions. For a confidence region
$S(x_1, \dots, x_n; R)$, it is possible to define a characteristic function by eliminating
the dependence on $R$:

$$\phi_\theta(x_1, \dots, x_n) = \frac{1}{n!} \sum \Pr\nolimits_R\{\theta \in S(x_{i_1}, \dots, x_{i_n}; R)\}$$

where the summation is over the $n!$ permutations $(i_1, \dots, i_n)$ of $(1, \dots, n)$. It
is easily seen that $\phi_\theta$ is the conditional expectation of $\Phi_\theta$ given values for
$x_{(1)}, \dots, x_{(n)}$ and that it is symmetric in $x_1, \dots, x_n$.

If $S(x_1, \dots, x_n; R)$ is a similar $\beta$ confidence region for $\xi_p(F)$ then

$$E_F(\phi_\theta(X_1, \dots, X_n)) = \beta$$

for all $F \in \mathfrak{F}_0$ having $\xi_p(F) = \theta$. Thus we obtain

$$E_F(\phi_\theta(X_1, \dots, X_n) - \beta) = 0$$

when $\xi_p(F) = \theta$.

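For simplicity consider the case $\theta = 0$; then the condition on $\phi_0$ is that
$\phi_0(x_1, \dots, x_n) - \beta$ be an unbiased estimate of zero for $\mathfrak{F}_0(p)$. By Theorem 2.1
we have

$$\phi_0(x_1, \dots, x_n) - \beta = \sum_{i=1}^{n} a(x_i)\, \psi_0(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \quad \text{a.e.}$$

and similarly

$$\phi_\theta(x_1, \dots, x_n) - \beta = \sum_{i=1}^{n} a(x_i - \theta)\, \psi_\theta(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \quad \text{a.e.}$$

Thus in part we have

THEOREM 4.1. A necessary and sufficient condition that $S(x_1, \dots, x_n; R)$
be a similar $\beta$ confidence region for $\xi_p(F)$ is that

$$\phi_\theta(x_1, \dots, x_n) - \beta = \begin{cases}
p\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) + \dots + p\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } \theta < x_{(1)}, \\[4pt]
-q\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) - \dots - q\,\psi_\theta(x_{(1)}, \dots, x_{(i-1)}, x_{(i+1)}, \dots, x_{(n)}) \\
\quad + p\,\psi_\theta(x_{(1)}, \dots, x_{(i)}, x_{(i+2)}, \dots, x_{(n)}) + \dots + p\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } x_{(i)} < \theta < x_{(i+1)}, \\[4pt]
-q\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) - \dots - q\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } x_{(n)} < \theta.
\end{cases} \tag{4.1}$$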


Proof. The necessity is proved above; the sufficiency follows immediately 
from Theorem 2.1. 

It is interesting to note that for any $\phi_\theta(x_1, \dots, x_n)$ of the form (4.1) with
values restricted to $[0,1]$ there exists a confidence region with $\phi_\theta(x_1, \dots, x_n)$
as characteristic function. Let

$$S(x_1, \dots, x_n; R) = \{\theta \mid \phi_\theta(x_1, \dots, x_n) \ge R\};$$

then it is easily seen that $S(x_1, \dots, x_n; R)$ is a similar $\beta$ confidence region for
$\xi_p(F)$.

The above theorem allows us to determine the form of upper confidence
bounds, and the next theorem shows that such bounds are necessarily a choice
of the order statistics.


THEOREM 4.2. If $S(x_1, \dots, x_n; R) = \,]-\infty, u(x_1, \dots, x_n; R)[$ is a similar
confidence region for $\xi_p(F)$, then

$$\bar{u}(x_1, \dots, x_n; R) = \begin{cases}
-\infty & \text{with probability } p_0(x_1, \dots, x_n), \\
x_{(1)} & \text{with probability } p_1(x_1, \dots, x_n), \\
\ \vdots \\
x_{(n)} & \text{with probability } p_n(x_1, \dots, x_n), \\
+\infty & \text{with probability } p_{n+1}(x_1, \dots, x_n),
\end{cases} \tag{4.2}$$

where $\bar{u}(x_1, \dots, x_n; R)$ is symmetric in $x_1, \dots, x_n$ and is obtained from
$u(x_1, \dots, x_n; R)$ by incorporating into $R$ the randomization of the $n!$ permutations
$(i_1, \dots, i_n)$ of $(1, \dots, n)$, and where $\sum p_i = 1$, $p_i \ge 0$, and there is a set of
functions $\{P_0(x_1, \dots, x_{n-1}), \dots, P_{n-1}(x_1, \dots, x_{n-1})\}$ such that

$$\begin{aligned}
p_i(x_1, \dots, x_n) = {} & -\sum_{j=0}^{i-1} P_j(x_{(1)}, \dots, x_{(i-1)}, x_{(i+1)}, \dots, x_{(n)}) \\
& + p \sum_{j=1}^{n-i} P_i(x_{(1)}, \dots, x_{(i)}, \dots, x_{(i+j-1)}, x_{(i+j+1)}, \dots, x_{(n)}) \\
& - q \sum_{j=1}^{i-1} P_{i-1}(x_{(1)}, \dots, x_{(j-1)}, x_{(j+1)}, \dots, x_{(i)}, \dots, x_{(n)}),
\end{aligned}$$

and conversely.


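Proof. The randomization inherent in the ordering of the $x_i$'s can without
loss of generality be combined with that provided by the random element $R$;
thus we assume that $S(x_1, \dots, x_n; R)$ and $u(x_1, \dots, x_n; R)$ are symmetric in the
x's.

By examining its definition we see that the characteristic function of an
upper confidence bound, $\phi_\theta(x_1, \dots, x_n)$, is a monotone non-increasing function
of $\theta$. From Theorem 4.1, $\phi_\theta(x_1, \dots, x_n) - \beta = G_\theta(x_1, \dots, x_n)$, where
$G_\theta(x_1, \dots, x_n)$ has almost everywhere the form

$$G_\theta(x_1, \dots, x_n) = \begin{cases}
p\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) + \dots + p\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } \theta < x_{(1)}, \\[4pt]
-q\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) - \dots - q\,\psi_\theta(x_{(1)}, \dots, x_{(i-1)}, x_{(i+1)}, \dots, x_{(n)}) \\
\quad + p\,\psi_\theta(x_{(1)}, \dots, x_{(i)}, x_{(i+2)}, \dots, x_{(n)}) + \dots + p\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } x_{(i)} < \theta < x_{(i+1)}, \\[4pt]
-q\,\psi_\theta(x_{(2)}, \dots, x_{(n)}) - \dots - q\,\psi_\theta(x_{(1)}, \dots, x_{(n-1)}) & \text{if } x_{(n)} < \theta.
\end{cases} \tag{4.3}$$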
Consider two values of $\theta$, say $\theta_1, \theta_2$ $(\theta_1 < \theta_2)$, and $2n$ values of $x$, say $z_1, \dots,
z_n, y_1, \dots, y_n$ $(z_1 < \dots < z_n < \theta_1 < \theta_2 < y_1 < \dots < y_n)$ and assume that any
$G_\theta$ using the $\theta$'s and $x$'s from these sets is of the form (4.3). (The average over
any small cube centred over the particular points in $R^n$ will always satisfy
(4.3).) Writing

$$\phi^*(x_1, \dots, x_n) = G_{\theta_1}(x_1, \dots, x_n) - G_{\theta_2}(x_1, \dots, x_n),$$

$$\psi^*(x_1, \dots, x_{n-1}) = \psi_{\theta_1}(x_1, \dots, x_{n-1}) - \psi_{\theta_2}(x_1, \dots, x_{n-1}),$$

we have from the monotonicity of $\phi_\theta$ that $\phi^*(x_1, \dots, x_n) \ge 0$. Then for $r$ z's,
say $z_1, \dots, z_r$, and $n - r$ y's, say $y_{r+1}, \dots, y_n$, we have

$$\begin{aligned}
& p\,\psi^*(z_1, \dots, z_r, y_{r+2}, \dots, y_n) + \dots + p\,\psi^*(z_1, \dots, z_r, y_{r+1}, \dots, y_{n-1}) \\
& \quad - q\,\psi^*(z_2, \dots, z_r, y_{r+1}, \dots, y_n) - \dots - q\,\psi^*(z_1, \dots, z_{r-1}, y_{r+1}, \dots, y_n) \ge 0.
\end{aligned} \tag{4.4}$$

Thus, for any set of $n$ numbers chosen from $z_1, \dots, z_n, y_1, \dots, y_n$ we write down
the $n$ values of the $\psi^*$ function by deleting successively one of the $n$ numbers.
To each of these we attach the coefficient $+p$ or $-q$ according as the deleted
number is a $y$ or a $z$. From the hypotheses it follows that the algebraic sum
is always nonnegative. Thus there are $\binom{2n}{n}$ inequalities and $\binom{2n}{n-1}$ $\psi^*$ values.





We proceed to show that the $\psi^*$ values are all zero. Consider $\binom{2n}{n-1}$ vectors
each with $\binom{2n}{n}$ coordinates. For each set of $n - 1$ numbers chosen from the
$2n$, we define a vector with a coordinate corresponding to each set of $n$ numbers
which can be chosen from the $2n$; the value of this coordinate is zero if the
$n - 1$ numbers giving the vector are not included in the $n$ numbers giving the
coordinate, and is $+p$, $-q$ if the $n - 1$ vector numbers are included in the
$n$ coordinate numbers and if the additional number is a $y$, $z$. The inequalities
(4.4) then say that a linear combination with weights $\psi^*$ of these vectors gives
a vector in or on the boundary of the first orthant. We wish to show that the
only such combination is a combination with zero coefficients and this of course
gives the zero vector (zero $\phi^*$ values).


We now define a vector which has coordinates $c(x_1, \dots, x_n)$ all positive and
is orthogonal to each of the $\binom{2n}{n-1}$ vectors defined above. This orthogonality
condition for the vector corresponding to $(z_1, \dots, z_r, y_{r+2}, \dots, y_n)$ is

$$\begin{aligned}
& p\,c(z_1, \dots, z_r, y_{r+1}, y_{r+2}, \dots, y_n) + \dots + p\,c(z_1, \dots, z_r, y_1, y_{r+2}, \dots, y_n) \\
& \quad - q\,c(z_1, \dots, z_r, z_{r+1}, y_{r+2}, \dots, y_n) - \dots - q\,c(z_1, \dots, z_r, z_n, y_{r+2}, \dots, y_n) = 0.
\end{aligned}$$

Defining $c(x_1, \dots, x_n)$ by

$$c(x_1, \dots, x_n) = \binom{n}{i(x_1, \dots, x_n)}^{-1} p^{\,n - i(x_1, \dots, x_n)}\, q^{\,i(x_1, \dots, x_n)},$$

the above equation becomes

$$p(r + 1)\binom{n}{r}^{-1} p^{r} q^{n-r} - q(n - r)\binom{n}{r+1}^{-1} p^{r+1} q^{n-r-1} = 0,$$

which is obviously true.
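
For small $n$ the orthogonality can be confirmed by brute force. The sketch below labels the $2n$ numbers only by kind (`'z'` or `'y'`) and index, builds the dot product of each of the $\binom{2n}{n-1}$ vectors with the weight vector $c$, and checks that it vanishes exactly; the labelling scheme and function name are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def check_orthogonality(n, p):
    """True if c is orthogonal to every vector indexed by an (n-1)-subset."""
    q = 1 - p
    numbers = [("z", k) for k in range(n)] + [("y", k) for k in range(n)]

    def c(subset):
        # i = number of positive values (y's) among the n chosen numbers
        i = sum(1 for kind, _ in subset if kind == "y")
        return Fraction(1, comb(n, i)) * p ** (n - i) * q ** i

    for small in combinations(numbers, n - 1):
        dot = Fraction(0)
        for extra in numbers:
            if extra in small:
                continue
            # coordinate value is +p if the added number is a y, -q if a z
            coeff = p if extra[0] == "y" else -q
            dot += coeff * c(small + (extra,))
        if dot != 0:
            return False
    return True
```

Since every coordinate of $c$ is strictly positive for $0 < p < 1$, this is the property the argument needs.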

Thus each of the $\binom{2n}{n-1}$ vectors and hence any linear combination thereof
will be vectors in the linear subspace perpendicular to the $c(x_1, \dots, x_n)$ vector
and passing through the origin. Because each coordinate of the $c(x_1, \dots, x_n)$
vector is positive, each vector in the linear subspace must have at least one
coordinate negative unless all are zero. Thus the only vector in the subspace
and in the first orthant is the zero vector. This means that the $\binom{2n}{n}$ values of
$\phi^*(x_1, \dots, x_n)$ are zero.

From $\phi^*(x_1, \dots, x_n) = 0$, we obtain that

$$\phi_{\theta_1}(x_1, \dots, x_n) = \phi_{\theta_2}(x_1, \dots, x_n)$$

so long as all $x$ values lie outside $[\theta_1, \theta_2]$. This equality of the $\phi$ functions implies
by the concluding remarks to Lemma 2.2 that the $\psi$ functions are also the same,

$$\psi_{\theta_1}(x_1, \dots, x_{n-1}) = \psi_{\theta_2}(x_1, \dots, x_{n-1}),$$

and hence $\psi_\theta(x_1, \dots, x_{n-1})$ is constant valued as a function of $\theta$ except possibly
for jumps at the points $x_1, \dots, x_{n-1}$. Letting $P_j$ stand for the jump at $x_{(j)}$, we
have

$$\psi_\theta(x_1, \dots, x_{n-1}) = -\sum_{j=0}^{i-1} P_j(x_1, \dots, x_{n-1}) \quad \text{if } x_{(i-1)} < \theta < x_{(i)}. \tag{4.5}$$










For fixed $x_1, \dots, x_n$, $\bar{u}(x_1, \dots, x_n; R)$ is a real valued random variable; it
has distribution function

$$\Pr\nolimits_R\{\bar{u}(x_1, \dots, x_n; R) > \theta\} = \Pr\nolimits_R\{\theta \in S(x_1, \dots, x_n; R)\} = \phi_\theta(x_1, \dots, x_n).$$

Thus from (4.5) and (4.3), we obtain (4.2). The converse follows from (4.3)
and the definitions of the functions involved. This completes the proof.


It might seem at first sight that a result similar to that above would apply 
to confidence intervals, viz., that they would be the interval between two 
order statistics chosen randomly. We give two examples of confidence intervals
for which the bounds are not both order statistics. 


Example 1. Let $\beta = .25$, $p = .5$, and $n = 2$. Let $f(x_1, x_2)$ be any real valued
function such that $x_{(1)} \le f(x_1, x_2) \le x_{(2)}$. Then a .25 similar confidence interval
for the median is

$$S(x_1, x_2; R) = \begin{cases} [x_{(1)}, f(x_1, x_2)] & \text{if } R < .5, \\ [f(x_1, x_2), x_{(2)}] & \text{if } R > .5. \end{cases}$$
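
The coverage claimed in Example 1 can be checked by simulation. The following is a minimal sketch, assuming a standard normal population (median 0) and taking $f(x_1, x_2)$ to be the midpoint of the two observations; both choices, the seed, and the function name are illustrative and not taken from the paper.

```python
import random


def coverage_example1(trials=200_000, seed=0):
    """Monte Carlo estimate of Pr{median in S(x1, x2; R)} for Example 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        lo, hi = min(x1, x2), max(x1, x2)
        mid = 0.5 * (lo + hi)  # one admissible choice of f(x1, x2)
        if rng.random() < 0.5:
            left, right = lo, mid
        else:
            left, right = mid, hi
        hits += left <= 0.0 <= right
    return hits / trials
```

The estimate should lie close to .25, since the two halves together cover the median exactly when $x_{(1)} \le 0 \le x_{(2)}$, an event of probability $\tfrac{1}{2}$, and each half is selected with probability $\tfrac{1}{2}$.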


Example 2. Let 8 = §, p=j, and n =2. Then a $ similar confidence 
interval for the p-percentile is 


S(x1, x2; R) = [xq, x] if x) < 0, 
= [xi», 0) if xa) < O < xq, 
= (0, x,)] if0 < xq. 


These are easily checked. A high confidence level can be obtained for either 
example by taking a larger value of n. 


The theorems above supply us with some indication of the form of similar 
confidence regions for the location parameter $\xi_p(F)$. It is perhaps natural then
to look for confidence regions possessing certain optimum properties. The 
following properties which one might require of confidence regions were intro- 
duced by Wald. However, since there is almost a complete analogy between 
confidence region theory and hypothesis testing theory we shall use the names 
which are standard for tests. We have the following definitions: 


DEFINITION 4.1. The power function of a confidence region for $\xi_p(F)$ is

$$P_F(\theta) = \Pr\nolimits_F(\theta \notin S(X_1, \dots, X_n; R)) = E_F(1 - \phi_\theta(X_1, \dots, X_n)). \tag{4.6}$$

DEFINITION 4.2. $S(x_1, \dots, x_n; R)$ is an unbiased confidence region for $\xi_p(F)$ if

$$P_F(\theta) \ge P_F(\xi_p(F)) \tag{4.7}$$

for all distributions $F \in \mathfrak{F}_0$.


The following theorems give us confidence regions for $\xi_p(F)$ with optimum
properties.


THEOREM 4.3. A most powerful (one-sided) similar confidence region for
$\xi_p(F)$ is $]-\infty, u(x_1, \dots, x_n; R)[$ where

$$u(x_1, \dots, x_n; R) = \begin{cases} x_{(i)} & \text{with probability } 1 - \alpha, \\ x_{(i+1)} & \text{with probability } \alpha. \end{cases} \tag{4.8}$$

This confidence region has maximum power for $\theta > \xi_p(F)$ among all similar
confidence regions.


Proof. Similar confidence regions must have P,(t,(F)) = 1— 86. For 
6 = 0 and £,(F) = 0, we obtain 


Er(o0(X1,...,Xn)) = B. 


Lemma (2.1) gives the restriction 


(4.9) ps a poncd ie *) bo (x; = ‘,) = 3 ae. 


z—2z.y 


a x,). For a distribution F(x) having F(0) = Pr = 1 — Q,, let 
the probability density f(x) = Prf_(x) + Orfs(x) where f,(x) and f_(x) are 
as defined in Lemma 2.1. Setting 


$F_+(x) = \int_0^x f_+(x)\,dx$, $F_-(x) = \int_{-\infty}^x f_-(x)\,dx$,

the power function of the confidence region satisfies

(4.10) $1 - P_F(0) = \int_{[0,1]^n} \sum_{x = F_-^{-1}, F_+^{-1}} P_F^{\,n - i(x(u_1), \ldots, x(u_n))} Q_F^{\,i(x(u_1), \ldots, x(u_n))} \phi_0(x(u_1), \ldots, x(u_n)) \prod_1^n du_i$.


Since we are considering the value $\theta = 0$, our problem is to maximize the
power for $0 > \xi_p(F)$, that is, for $p < P_F$. A solution to this maximization is
obtained by minimizing the integrand of (4.10) subject to (4.9) and the restriction
that the values of $\phi_0$ belong to $[0,1]$. This is a simple binomial distribution
problem with solution

$\phi_0(x_1, \ldots, x_n) = 0$ if $i(x_1, \ldots, x_n) < n - i$,
$= a$ if $i(x_1, \ldots, x_n) = n - i$,
$= 1$ if $i(x_1, \ldots, x_n) > n - i$

for some $a$, $i$.
It will be derived as a corollary to Theorem (5.1) that the confidence region (4.8)
is a most powerful (one-sided) $\beta$ confidence region.


THEOREM 4.4. A most powerful unbiased confidence region for $\xi_p(F)$ is

(4.11) $[f(x_1, \ldots, x_n; R),\ g(x_1, \ldots, x_n; R)] = [x_{(i)}, x_{(i+j)}]$ with probability $p_1$,
$= [x_{(i+1)}, x_{(i+j)}]$ with probability $p_2$,
$= [x_{(i)}, x_{(i+j+1)}]$ with probability $p_3$,
$= [x_{(i+1)}, x_{(i+j+1)}]$ with probability $p_4$,

where $i$, $j$, $p_1$, $p_2$, $p_3$, $p_4$ are chosen to make the interval unbiased with confidence
level $\beta$. This confidence region has maximum power for $\theta \neq \xi_p(F)$ among unbiased
similar confidence regions.


Proof. The proof using Lemma (2.1) follows closely that used in Theorem
4.3. It is worth noting that there remains one degree of freedom in the choice
of the $p$'s.

That the confidence region (4.11) is a most powerful unbiased $\beta$ confidence
region for $\xi_p(F)$ will be derived as a corollary to Theorem (5.2).


5. Tests for the location parameter. We obtain most powerful and most
powerful unbiased tests for some hypotheses concerning the location parameter.
Consider first a hypothesis completely specifying the location parameter.

Hypothesis 1: $\xi_p(F) = \theta$, $F \in \mathfrak{F}_0$;
Alternative 1: $\xi_p(F) > \theta$, $F \in \mathfrak{F}_0$.

For the problem of obtaining a test, we have the following

THEOREM 5.1. The one-sided sign test applied to $(x_1 - \theta, \ldots, x_n - \theta)$ is
most powerful for the Hypothesis 1 against the Alternative 1.


Proof. For simplicity assume $\theta = 0$ and consider a distribution belonging
to the Alternative 1. It will have a density function $f(x)$ which can be decomposed
into $f_-(x)$ and $f_+(x)$ as in Lemma (2.1) giving

$f(x) = p' f_-(x) + q' f_+(x)$.

Following a procedure used by Lehmann in (4), we look for a distribution
over the parameter space of Hypothesis 1. The obvious choice for such a least
favourable distribution is to give probability one to the distribution
$f_0(x) = p f_-(x) + q f_+(x)$. For a sample of $n$ the most powerful test of $f_0(x)$ against
$f(x)$ is given by the test function

$\phi(x_1, \ldots, x_n) = 1$ if $\prod f(x_i) / \prod f_0(x_i) > c$,
$= a$ if $\prod f(x_i) / \prod f_0(x_i) = c$,
$= 0$ if $\prod f(x_i) / \prod f_0(x_i) < c$.

Obviously this is equivalent to

$\phi(x_1, \ldots, x_n) = 1$ if $i(x_1, \ldots, x_n) > i_0$,
$= a$ if $i(x_1, \ldots, x_n) = i_0$,
$= 0$ if $i(x_1, \ldots, x_n) < i_0$.

More generally the sign test is based on the $(x_i - \theta)$'s.
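
The randomized one-sided sign test described above can be sketched as follows. The sketch assumes that $i(x_1, \ldots, x_n)$ counts the positive differences $x_k - \theta$, so that under the hypothesis $\xi_p(F) = \theta$ this count is binomial with parameters $n$ and $1 - p$. The function names and the choice of returning the rejection probability are illustrative assumptions, not the paper's notation.

```python
from math import comb


def sign_test_critical(n, p, alpha):
    """Return (i0, a): reject if i > i0, reject with probability a if i == i0.

    Under the hypothesis each x_k - theta is positive with probability 1 - p,
    so the number i of positive signs is Binomial(n, 1 - p).
    """
    q = 1 - p
    pmf = [comb(n, k) * q**k * p ** (n - k) for k in range(n + 1)]
    tail = 0.0  # Pr{i > i0} under the hypothesis
    for i0 in range(n, -1, -1):
        if tail + pmf[i0] >= alpha:
            return i0, (alpha - tail) / pmf[i0]
        tail += pmf[i0]
    raise ValueError("alpha must lie in (0, 1)")


def sign_test_function(xs, theta, i0, a):
    """Value of the test function: the probability of rejecting for this sample."""
    i = sum(1 for x in xs if x - theta > 0)
    if i > i0:
        return 1.0
    return a if i == i0 else 0.0


i0, a = sign_test_critical(n=10, p=0.5, alpha=0.05)
print(i0, round(a, 4))
```

Randomizing on the boundary value $i_0$ is what makes the size exactly $\alpha$; without it the attainable sizes are restricted to the binomial tail probabilities.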













COROLLARY. The one-sided sign test is most powerful for the Hypothesis 2
against the Alternative 2.

Hypothesis 2: $\xi_p(F) \in S$, $F \in \mathfrak{F}_0$;
Alternative 2: $\xi_p(F) \in S'$, $F \in \mathfrak{F}_0$.

$S$ and $S'$ are sets on $R^1$ having $\sup S < \inf S'$. The test is based on the signs of
$x_1 - \sup S, \ldots, x_n - \sup S$.

Proof. Follows easily from Theorem 5.1.

COROLLARY. The confidence region (4.8) for $\xi_p(F)$ is a most powerful (one-sided)
$\beta$ confidence region.

Proof. By straightforward analogy from the Theorem.

THEOREM 5.2. The unbiased two-sided sign test is most powerful unbiased for

Hypothesis 3: $\xi_p(F) = \theta$,
Alternative 3: $\xi_p(F) \neq \theta$.

For the proof of this theorem the following lemma is needed.

LEMMA 5.1. Any unbiased test of Hypothesis 3 against Alternative 3 is a
test similar over Hypothesis 3.


Proof. Consider a distribution belonging to Hypothesis 3 and having a
continuous density function $f(x)$ for which $f(0) > 0$ and

$f(x + \epsilon) < G(x)$ $(|\epsilon| < \delta)$

where $G(x)$ is integrable. The power of an unbiased test $\phi(x_1, \ldots, x_n)$ is

$\int \phi(x_1, \ldots, x_n) \prod f(x_i + \epsilon) \prod dx_i$

and is a continuous function of $\epsilon$. Since we have assumed $\phi(x_1, \ldots, x_n)$ to be
unbiased of, say, size $\alpha$, we have

$\int \phi(x_1, \ldots, x_n) \prod f(x_i + \epsilon) \prod dx_i \geq \alpha$ if $\epsilon \neq 0$.

From the continuity we obtain

$\int \phi(x_1, \ldots, x_n) \prod f(x_i) \prod dx_i \geq \alpha$;

but since $f(x)$ corresponds to a distribution of Hypothesis 3 we have that the
above expression is less than or equal to $\alpha$. Therefore

$\int \phi(x_1, \ldots, x_n) \prod f(x_i) \prod dx_i = \alpha$

for all distributions belonging to Hypothesis 3 which have a continuous density
satisfying the bounding condition. Such a class of distributions can replace
$\mathfrak{F}_0(p)$ in Theorem 2.1 with the results remaining valid. Hence we have conditions
on $\phi(x_1, \ldots, x_n)$ and obtain that it is a test similar over Hypothesis 3.

This proves the lemma. This type of argument from unbiasedness to similarity
was used by Lehmann.













Proof of Theorem 5.2. From Lemma 5.1 any unbiased test is a similar test
and hence by Lemma 2.1 has the form (2.4). Following the argument used in
Theorem 4.4, we obtain a solution test function

$\phi(x_1, \ldots, x_n) = 1$ if $x_{(i+j+1)} < 0$,
$= c''$ if $x_{(i+j)} < 0 < x_{(i+j+1)}$,
$= 0$ if $x_{(i+1)} < 0 < x_{(i+j)}$,
$= c'$ if $x_{(i)} < 0 < x_{(i+1)}$,
$= 1$ if $0 < x_{(i)}$,

where $i$, $j$, $c'$, $c''$ are chosen to make the test unbiased of size $\alpha$. This completes
the proof.


COROLLARY. The confidence region (4.11) is a most powerful unbiased $\beta$
confidence region for $\xi_p(F)$.

Proof. By straightforward analogy from the Theorem.

THEOREM 5.3. The most powerful unbiased test of Hypothesis 3 against Alternative 3
is most stringent if $p = \tfrac{1}{2}$.


A test is most stringent if it minimizes the maximum difference between
envelope power and power. Thus if $\phi$ (or $\phi'$) is a test function of size $\alpha$ for a
hypothesis $H$, then $\phi^*$ also of size $\alpha$ for $H$ is most stringent against the alternative
hypothesis $T$ if $\phi^*$ minimizes

$\sup_{F \in T} \left( \sup_{\phi'} E_F(\phi') - E_F(\phi) \right)$

as a function of $\phi$.

Proof. Let $f_1(x)$ be the density of any distribution belonging to Alternative
3. Then $f_1(x) = p' f_-(x) + q' f_+(x)$. $f_2(x) = q' f_-(x) + p' f_+(x)$ is also the density
of a distribution belonging to Alternative 3. From symmetry and Theorem 5.1,
we know that the envelope of the power functions for size $\alpha$ tests of Hypothesis
3 has the same value for these two distributions.

For a least favourable distribution over Hypothesis 3 we would choose all
probability for the distribution having density $f(x) = \tfrac{1}{2} f_-(x) + \tfrac{1}{2} f_+(x)$; and
for a distribution over the two alternatives mentioned above we would take
probability $\tfrac{1}{2}$ for each. A most powerful test for this reduced problem is

$\phi(x_1, \ldots, x_n) = 1$ if $\left[\tfrac{1}{2} \prod f_1(x_i) + \tfrac{1}{2} \prod f_2(x_i)\right] / \prod f(x_i) > K$,
$= a$ if $\left[\tfrac{1}{2} \prod f_1(x_i) + \tfrac{1}{2} \prod f_2(x_i)\right] / \prod f(x_i) = K$,
$= 0$ if $\left[\tfrac{1}{2} \prod f_1(x_i) + \tfrac{1}{2} \prod f_2(x_i)\right] / \prod f(x_i) < K$.

But

$\left[\tfrac{1}{2} \prod f_1(x_i) + \tfrac{1}{2} \prod f_2(x_i)\right] / \prod f(x_i) = \tfrac{1}{2} \left(\frac{p'}{1/2}\right)^{n-i} \left(\frac{q'}{1/2}\right)^{i} + \tfrac{1}{2} \left(\frac{q'}{1/2}\right)^{n-i} \left(\frac{p'}{1/2}\right)^{i}$,

where $i = i(x_1, \ldots, x_n)$. From this it is easily seen that the test is the one
obtained in the previous theorem.

This test is similar, it maximizes the minimum power for the two simple
alternatives (since the power is the same for these two alternatives), and the
test is independent of the alternatives used in its derivation. By the theorem of
Hunt and Stein (5), the test is most stringent.


6. A bivariate problem. A familiar statistical problem is the following.
Observations are obtained in pairs $(x_i, y_i)$. The $x_i$ and $y_i$ values come from the
same, say, plot, and the $y$ value is the result corresponding to some "treatment"
while the $x$ value corresponds to no treatment. The problem is to find whether
the $y$ values tend to be larger than the $x$ values. Often the $x$ and $y$ components
cannot be assumed independent, and perhaps no assumption can be made
concerning the joint distribution.

In such a situation one or other of the following formulations might be a
suitable idealization of the problem. $F(x,y)$ has a density $f(x,y)$:

Hypothesis I: $\xi_{.5}(F(\infty, y)) - \xi_{.5}(F(x, \infty)) = 0$,
Alternative I: $\xi_{.5}(F(\infty, y)) - \xi_{.5}(F(x, \infty)) > 0$.
Hypothesis II: $\xi_{.5}(G(z)) = 0$,
Alternative II: $\xi_{.5}(G(z)) > 0$,

where $G(z)$ is the distribution function corresponding to $y - x$.

It seems to the author that the second formulation is more realistic. In any case
it is the second formulation for which this paper gives an answer.


THEOREM 6.1. For a sample of $n$ from a distribution $F(x,y)$ having a density,
the one-sided sign test is most powerful for formulation II.

Proof. Consider an alternative distribution having density $f_1(x,y)$ and let

$f_1(x, y) = p f_-(x, y) + q f_+(x, y)$,

where $f_-(x,y)$ and $f_+(x,y)$ are respectively density functions over the regions
$\{(x,y) \mid y - x < 0\}$, $\{(x,y) \mid y - x > 0\}$. Obviously $p < \tfrac{1}{2}$.
The distribution with density $f_0(x,y) = \tfrac{1}{2} f_-(x,y) + \tfrac{1}{2} f_+(x,y)$ belongs to the
Hypothesis. Giving this distribution probability one as a least favourable
distribution over the Hypothesis, we obtain the test

$\phi(x_1, y_1; \ldots; x_n, y_n) = 1$ if $\prod f_1(x_i, y_i) / \prod f_0(x_i, y_i) > c$,
$= a$ if $\prod f_1(x_i, y_i) / \prod f_0(x_i, y_i) = c$,
$= 0$ if $\prod f_1(x_i, y_i) / \prod f_0(x_i, y_i) < c$.

This is the one-sided sign test mentioned in the statement of the theorem. Since
it is a similar test, it is a most powerful test of the Hypothesis against the
particular alternative. However the test does not depend on the alternative
used in the derivation; hence it is a uniformly most powerful test for
formulation II.
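
For paired data the test of formulation II uses only the signs of $y_i - x_i$. The sketch below computes the exact one-sided tail probability of the observed number of positive differences under the hypothesis that the median of $y - x$ is zero. It is the non-randomized version of the test, and the data values and the function name are made up for illustration.

```python
from math import comb


def paired_sign_test_pvalue(xs, ys):
    """One-sided p-value: Pr{count of positive y - x >= observed} under median 0."""
    # Ties have probability zero for a continuous distribution; drop any that occur.
    diffs = [y - x for x, y in zip(xs, ys) if y != x]
    n = len(diffs)
    i = sum(1 for d in diffs if d > 0)
    # Under the hypothesis the count of positive signs is Binomial(n, 1/2).
    return sum(comb(n, k) for k in range(i, n + 1)) / 2**n


# Hypothetical plot yields: x without treatment, y with treatment.
control = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7, 5.2, 4.0]
treated = [4.6, 4.0, 4.9, 5.1, 4.5, 5.0, 5.9, 4.3]
print(paired_sign_test_pvalue(control, treated))
```

No assumption about the joint distribution of $(x, y)$ enters the computation beyond the existence of a density, which matches the setting of the theorem.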

The sign test utilizes only the signs of the differences $y_i - x_i$. A test based
on the signs of the differences $y_i - x_i$ and on the ranks of the numbers
$|y_1 - x_1|, \ldots, |y_n - x_n|$ was proposed by Wilcoxon (9); the procedure for applying the
test is similar to that for the Wilcoxon (Mann-Whitney) two sample test.
This sign-rank test is designed to test the more restricted hypothesis:

Hypothesis III: $f(x,y)$ is symmetric about $y - x = 0$. This hypothesis
requires that $\xi_{.5}(G(z)) = 0$ and in addition that the distribution is symmetric
about $z = 0$.

For the restricted Hypothesis III, conceivably the Wilcoxon sign-rank test
(of size $\alpha$) could be more powerful for certain alternatives than the sign test
(of size $\alpha$).

For the Hypothesis II the Wilcoxon sign-rank test does not apply (the size
determination in (9) presupposes Hypothesis III), and the sign test, as was
shown above, is most powerful.


7. Definition of scale and location-scale parameters. Conditions that
parameters be scale and location-scale parameters for a class of distributions
as large as $\mathfrak{F}_0$ could be formulated along the lines followed for the location
parameters in §3. The result however would be the definitions:

DEFINITION (7.1). The scale parameter is $\eta(F) = \xi_{p_2}(F) - \xi_{p_1}(F)$.

DEFINITION (7.2). The location-scale parameter is $(\xi_{p_1}(F), \xi_{p_2}(F))$.

8. Confidence regions and tests for the scale parameter. It is not difficult
to find confidence intervals for the scale parameter; they may be derived from
the order statistics. No attempt is made to get a best confidence interval,
but rather a result of negative nature is obtained: the nonexistence of similar
confidence regions.

THEOREM 8.1. Similar $\beta$ $(\beta \neq 0, 1)$ confidence regions do not exist for $\eta(F)$
(other than degenerate regions).

To prove this theorem we need an analogue of Theorem 2.1 for distributions
having two percentiles fixed.


THEOREM 8.2. For the class of distributions

$\mathfrak{F}_{a,b} = \{\prod F(x_i) \mid F(x) \in \mathfrak{F}_0,\ \xi_{p_1}(F) = a,\ \xi_{p_2}(F) = b\}$

a symmetric unbiased estimate of zero has the form

(8.1) $\phi(x_1, \ldots, x_n) = \sum_i \alpha(x_i)\, \psi_1(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) + \sum_i \beta(x_i)\, \psi_2(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$,

where

$\beta(x) = +p_2$ if $x > b$,
$= -(1 - p_2)$ if $x < b$,
$\alpha(x) = +p_1$ if $x > a$,
$= -(1 - p_1)$ if $x < a$,

and $\psi_1$ and $\psi_2$ are bounded and symmetric.


Proof. Although the proof of this theorem can be given quite similarly to
that of Theorem 2.1, we outline another form, the steps of which can be used
in the proof of Theorem 8.1.

If $f_1(x), \ldots, f_n(x)$ are arbitrary bounded measurable functions, we define

$\phi(f_1, \ldots, f_n) = \int \phi(x_1, \ldots, x_n) \prod f_i(x_i) \prod dx_i$.

Since $\phi(f, \ldots, f) = 0$ if $f \in \mathfrak{F}_{a,b}$, it follows by the method of proof in (3) that
$\phi(f_1, \ldots, f_n) = 0$ if all $f_i \in \mathfrak{F}_{a,b}$.

Defining $\alpha(f)$, $\beta(f)$ in the manner used for $\phi(f_1, \ldots, f_n)$, we have
$\phi(f_1, \ldots, f_n) = 0$ if $\alpha(f_1) = \beta(f_1) = \ldots = \beta(f_n) = 0$ where the $f$'s may be linear
combinations of elements of $\mathfrak{F}_{a,b}$. If $f_0^\alpha$ and $f_0^\beta$ have $\alpha(f_0^\alpha) \neq 0$, $\beta(f_0^\alpha) = 0$
and $\alpha(f_0^\beta) = 0$, $\beta(f_0^\beta) \neq 0$, then

$\phi(f, f_2, \ldots, f_n) - \frac{\alpha(f)}{\alpha(f_0^\alpha)}\, \phi(f_0^\alpha, f_2, \ldots, f_n) - \frac{\beta(f)}{\beta(f_0^\beta)}\, \phi(f_0^\beta, f_2, \ldots, f_n)$

will be zero if $\alpha(f_2) = \beta(f_2) = \ldots = \beta(f_n) = 0$. This obtains from a simple
analysis of linear functions over a vector space. Proceeding in this manner, the
expression (8.1) is obtained fairly easily.


Proof of Theorem 8.1. Letting $S(x_1, \ldots, x_n; R)$ be a confidence region for
$\eta(F)$, we define a corresponding characteristic function

$\phi_\eta(x_1, \ldots, x_n) = \frac{1}{n!} \sum \Pr_R\{\eta \in S(x_{i_1}, \ldots, x_{i_n}; R)\}$,

where the summation is over the $n!$ permutations. If $S$ is a similar confidence
region, then

$E_F\{\phi_\eta(X_1, \ldots, X_n)\} = \beta$

for all distributions belonging to $\mathfrak{F}_{x_0, x_0 + \eta}$. Thus

$\phi^*_\eta(x_1, \ldots, x_n) = \beta - \phi_\eta(x_1, \ldots, x_n)$

is an unbiased estimate of zero for $F \in \mathfrak{F}_{x_0, x_0 + \eta}$, for all $x_0$. Letting $\alpha_{x_0}(x)$, $\beta_{x_0}(x)$
be as defined in Theorem 8.2 with $a = x_0$ and $b = x_0 + \eta$, then

$\phi^*(f, f_2, \ldots, f_n) - \frac{\alpha_{x_0}(f)}{\alpha_{x_0}(f_0^\alpha)}\, \phi^*(f_0^\alpha, f_2, \ldots, f_n) - \frac{\beta_{x_0}(f)}{\beta_{x_0}(f_0^\beta)}\, \phi^*(f_0^\beta, f_2, \ldots, f_n)$

is equal to zero whenever

$\alpha_{x_0}(f_2) = \beta_{x_0}(f_2) = \ldots = \beta_{x_0}(f_n) = 0$.

If in addition $f_2, \ldots, f_n$, $f_0^\alpha$, $f_0^\beta$ are functions equal to zero in the intervals
$[a_0, a_0 + \epsilon]$, $[a_0 + \eta, a_0 + \eta + \epsilon]$, then by taking $f$ to be nonzero only in
$[a_0, a_0 + \tfrac{1}{2}\epsilon]$ or in $[a_0 + \eta + \tfrac{1}{2}\epsilon, a_0 + \eta + \epsilon]$ and by changing $x_0$ from $a_0$ to $a_0 + \tfrac{1}{2}\epsilon$
to $a_0 + \epsilon$, the second and third terms are changed successively in sign without
altering the value of the expression as a whole. Hence $\phi^*(f_0^\alpha, f_2, \ldots, f_n)$ and
$\phi^*(f_0^\beta, f_2, \ldots, f_n)$ are equal to zero if $\alpha_a(f_2) = \beta_a(f_2) = \ldots = \beta_a(f_n) = 0$ and
if these functions are zero in $[a, a + \epsilon]$ and $[a + \eta, a + \eta + \epsilon]$.

Proceeding in this manner we obtain finally that if $f_0^{(1)}, \ldots, f_0^{(n)}$ are equal
to 1 respectively on the disjoint intervals $I_1, \ldots, I_n$ and are zero elsewhere, then

$\int_{I_1} \cdots \int_{I_n} \phi^*(x_1, \ldots, x_n) \prod dx_i = 0$

and $\phi^*(x_1, \ldots, x_n) = 0$ almost everywhere.
Thus we obtain $\phi_\eta = \beta$. This means that given the order statistics
$x_{(1)}, \ldots, x_{(n)}$ the probability is $\beta$ that $S(x_1, \ldots, x_n; R)$ covers any arbitrary positive
real number. Such a confidence region is essentially equivalent to

$S(x_1, \ldots, x_n; R) = \,]0, \infty[$ if $R < \beta$,
$= \emptyset$ if $R > \beta$,

which we refer to as a degenerate confidence interval. This completes the proof.
By analogy we have

THEOREM 8.3. For the hypothesis $\eta(F) = \eta_0$, similar tests other than
$\phi(x_1, \ldots, x_n) = \alpha$ do not exist.


9. Distribution-free upper tolerance bounds. A distribution-free upper
tolerance bound is a particular type of distribution-free or non-parametric
tolerance region. We first define this latter concept. Let $(\mathcal{X}, \mathcal{A})$ be a measurable
space, that is, $\mathcal{X}$ is an arbitrary space and $\mathcal{A}$ is a class of subsets $A$ of $\mathcal{X}$ which
form a $\sigma$-algebra, and let $\{P_\theta(A) \mid \theta \in \Omega\}$ be a class of probability measures
over the space $\mathcal{X}$.

A tolerance region for the class of measures $\{P_\theta(A) \mid \theta \in \Omega\}$ is a function
$A(x_1, \ldots, x_n)$ which maps $\mathcal{X}^n$ into the class $\mathcal{A}$ and for which the distribution of
$P_\theta(A(x_1, \ldots, x_n))$ induced by the product probability measure $P_\theta^n$ over $\mathcal{X}^n$ is independent
of $\theta \in \Omega$.

Weaker forms of this definition have been used in particular problems.
For a distribution-free upper tolerance bound we need $\mathcal{X} = R^1$, $\{P_\theta(A) \mid \theta \in \Omega\} = \mathfrak{F}_0$,
and $A(x_1, \ldots, x_n) = \,]-\infty, f(x_1, \ldots, x_n)[$.

In 1944 Robbins (6) considered the problem of finding the most general
distribution-free upper tolerance bounds. He proved that, subject to continuity
restrictions on the function $f(x_1, \ldots, x_n)$,

$\prod_{i=1}^{n} (f(x_1, \ldots, x_n) - x_i) = 0$.







Here we remove the continuity restrictions and envisage randomized bounds
$f(x_1, \ldots, x_n; R)$ where $R$ is a random variable with a uniform distribution on
$[0,1]$. Our result is in effect that $f(x_1, \ldots, x_n; R)$ chooses the order statistics
with fixed probabilities.

The problem of the most general bound can be given quite interestingly as
a measure problem. In (7) it was shown that any continuous distribution over
$R^1$ can be obtained from the uniform distribution over $[0,1]$ by a monotone
strictly increasing mapping of $[0,1]$ into $R^1$, and conversely. Let such a mapping
be $g(u)$, corresponding to the distribution function $G(x)$. Then essentially $g(u) = G^{-1}(u)$
where $G^{-1}$ is the inverse function of $G(x)$. Also let $\mathfrak{G}$ be the class of all
continuous distribution functions $G(x)$. Then to find the distribution-free
upper tolerance bounds is to find functions $f(x_1, \ldots, x_n; R)$ for which the
Lebesgue measure of

$\{(u_1, \ldots, u_n, u_{n+1}) \mid G(f(G^{-1}(u_1), \ldots, G^{-1}(u_n); u_{n+1})) < v,\ (u_1, \ldots, u_{n+1}) \in [0,1]^{n+1}\}$

is independent of $G(x) \in \mathfrak{G}$, for all $v \in [0,1]$.

THEOREM 9.1. A necessary and sufficient condition that $f(x_1, \ldots, x_n; R)$ be a
distribution-free upper tolerance limit is that $f(x_{i_1}, \ldots, x_{i_n}; R)$ chosen with probability
$1/n!$ for fixed $x_{(1)}, \ldots, x_{(n)}$ should be equivalent to $x_{(1)}, \ldots, x_{(n)}$, chosen
with fixed but arbitrary probabilities $p_1, \ldots, p_n$ respectively where $\sum p_i = 1$
(almost everywhere $\mathfrak{G}$).

Proof. Let $H(v)$ be the distribution function of

$P_G(]-\infty, f(X_1, \ldots, X_n; R)[) = G(f(X_1, \ldots, X_n; R))$

when the $X_1, \ldots, X_n$ have the distribution function $G(x) \in \mathfrak{G}$. Then

$H(v) = \Pr_G\{G(f(X_1, \ldots, X_n; R)) < v\}$
$= \Pr_G\{f(X_1, \ldots, X_n; R) < G^{-1}(v)\}$
$= \Pr_G\{f(X_1, \ldots, X_n; R) < \xi_v(G)\}$.

Thus $f(x_1, \ldots, x_n; R)$ is a similar $1 - H(v)$ confidence bound for $\xi_v(G)$. Letting
$\phi_\theta(x_1, \ldots, x_n)$ be the characteristic function of the region $]-\infty, f(x_1, \ldots, x_n; R)[$,
Lemma 2.1 gives conditions on $\phi_\theta$ which for $\theta = 0$ are

$\sum_{x = z, y} v^{\,n - i(x_1, \ldots, x_n)} (1 - v)^{\,i(x_1, \ldots, x_n)} \phi_0(x_1, \ldots, x_n) = 1 - H(v)$

where the $z_i < 0$ and the $y_j > 0$. Not only must this hold almost everywhere
$(z_1, \ldots, z_n; y_1, \ldots, y_n) \in \,]-\infty, 0[^n \times ]0, \infty[^n$, but it must hold for all $v$. The
above equation determines the form of the right hand side

$1 - H(v) = \sum_i C_i\, v^{\,n-i} (1 - v)^{i}$,

and this in turn implies

$\phi_0(x_1, \ldots, x_n) = C_{i(x_1, \ldots, x_n)}$

almost everywhere. Similarly we obtain

$\phi_\theta(x_1, \ldots, x_n) = C_{i(x_1 - \theta, \ldots, x_n - \theta)}$.

Setting

$C_i = p_0 + \sum_{j = n-i+1}^{n} p_j$

and assuming $f(x_1, \ldots, x_n; R)$ to be symmetric in the $x$'s, we obtain

$f(x_1, \ldots, x_n; R) = +\infty$ with probability $p_0$,
$= x_{(1)}$ with probability $p_1$,
$\ldots$
$= x_{(n)}$ with probability $p_n$,
$= -\infty$ with probability $1 - \sum_0^n p_i$.

Then with the obvious modification if $f(x_1, \ldots, x_n; R)$ is not symmetric, the
theorem is proved.
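
The distribution-free property of an order statistic used as an upper tolerance bound can be checked numerically. The sketch below is a Monte Carlo illustration, not part of the paper: it relies on the standard fact that for continuous $G$ the probability content $G(x_{(k)})$ has a Beta$(k, n-k+1)$ distribution with mean $k/(n+1)$. The sample sizes, seeds and the two test distributions are arbitrary choices.

```python
import math
import random


def mean_coverage(sampler, cdf, n, k, trials, seed):
    """Average probability content of ]-inf, x_(k)[ over simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = sorted(sampler(rng) for _ in range(n))
        total += cdf(xs[k - 1])  # G(x_(k)) for this sample
    return total / trials


# Two quite different continuous distributions (illustrative choices).
expo = mean_coverage(lambda r: r.expovariate(1.0),
                     lambda x: 1 - math.exp(-x), n=9, k=7, trials=4000, seed=1)
norm = mean_coverage(lambda r: r.gauss(0.0, 1.0),
                     lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2))),
                     n=9, k=7, trials=4000, seed=2)
print(round(expo, 3), round(norm, 3), 7 / 10)
```

Both averages should lie close to $k/(n+1) = 0.7$ whatever the underlying distribution, which is the sense in which the bound is distribution-free.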


10. Two theorems in estimation theory. In their 1950 paper (8) Lehmann
and Scheffé give several theorems for unbiased estimation. One of these defines
the class of uniformly minimum variance (UMV) unbiased estimates given the
class of unbiased estimates of zero. For non-parametric application this was
restricted in that they considered only estimates with finite variance over the
parameter space. Since there are reasonable non-parametric estimates not
satisfying this condition, consider an extension of their results.

Let $\{P_\theta \mid \theta \in \Omega\}$ be the class of distributions under consideration and let
$T(x)$ be a sufficient statistic with corresponding distributions $\{P_\theta^T \mid \theta \in \Omega\}$.
We now consider the estimation of real valued functions $g^*(\theta)$ which exist over
the parameter space.

LEMMA 10.1. Under the assumption that all statistics in $\nu_0$ have finite variance,
a statistic is a minimum variance (UMV) unbiased estimate of its expected value
if and only if it belongs to $\nu_1$ where

$\nu_0 = \{f(t) \mid E_\theta\{f(T)\} = 0,\ \theta \in \Omega\}$,
$\nu_1 = \{g(t) \mid E_\theta\{g(T) f(T)\} = 0\ (\theta \in \Omega,\ f(t) \in \nu_0,\ \mathrm{Var}_\theta\, g(T) < \infty);\ E_\theta|g(T)| < \infty\ (\theta \in \Omega)\}$.
Proof. Since we are concerned with UMV unbiased estimates, the Rao-Blackwell
theorem says that we may restrict attention to estimates based on
the sufficient statistic.

Let $g(t)$ be a UMV unbiased estimate of $g^*(\theta)$. If $f(t) \in \nu_0$ then $g(t) + \lambda f(t)$
is also unbiased for $g^*(\theta)$ and must have variance at least as large as $g(t)$:

$\mathrm{Var}_\theta\{g(T) + \lambda f(T)\} = \mathrm{Var}_\theta\{g(T)\} + 2\lambda E_\theta\{g(T) f(T)\} + \lambda^2 \mathrm{Var}_\theta\{f(T)\} \geq \mathrm{Var}_\theta\{g(T)\}$.

If $\mathrm{Var}_\theta\{g(T)\}$ is finite then the above inequality being true for all positive and
negative $\lambda$ implies that $E_\theta\{g(T) f(T)\} = 0$, that is, $g(t) \in \nu_1$.

Next assume that $g(t) \in \nu_1$ and let $g'(t)$ be any other unbiased estimate of
$g^*(\theta)$. Then $g'(t) - g(t)$ is an unbiased estimate of zero, say $f(t)$, and

$\mathrm{Var}_\theta\{g'(T)\} = \mathrm{Var}_\theta\{g(T) + f(T)\}$
$= \mathrm{Var}_\theta\{g(T)\}$ if $\mathrm{Var}_\theta\{g(T)\} = \infty$,
$= \mathrm{Var}_\theta\{g(T)\} + \mathrm{Var}_\theta\{f(T)\}$ if $\mathrm{Var}_\theta\{g(T)\} < \infty$.

In either case we have $\mathrm{Var}_\theta\{g'(T)\} \geq \mathrm{Var}_\theta\{g(T)\}$ which means that $g(t)$ is a
UMV unbiased estimate of $g^*(\theta)$. This proves the lemma.
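
The step in the proof from the variance inequality to the orthogonality condition can be spelled out as below. This is a standard completion of the argument, written under the assumptions that both variances are finite and that $\mathrm{Var}_\theta\{f(T)\} > 0$; it is not quoted from the paper.

```latex
\begin{align*}
0 &\le \mathrm{Var}_\theta\{g(T) + \lambda f(T)\} - \mathrm{Var}_\theta\{g(T)\}
   = 2\lambda\, E_\theta\{g(T) f(T)\} + \lambda^2\, \mathrm{Var}_\theta\{f(T)\}
   \quad \text{for all real } \lambda, \\
\lambda_0 &= -\frac{E_\theta\{g(T) f(T)\}}{\mathrm{Var}_\theta\{f(T)\}}
   \;\Longrightarrow\;
   0 \le -\frac{\bigl(E_\theta\{g(T) f(T)\}\bigr)^2}{\mathrm{Var}_\theta\{f(T)\}}, \\
&\text{hence } E_\theta\{g(T) f(T)\} = 0 .
\end{align*}
```

Here the covariance of $g(T)$ and $f(T)$ reduces to $E_\theta\{g(T) f(T)\}$ because $f$ has expectation zero.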


Also for convex loss functions we have the following

LEMMA 10.2. If a real valued parameter $g(\theta)$ has a minimum risk unbiased
estimate, then the estimate is unique almost everywhere $[P_\theta^T]$ (assuming the loss
function is strictly convex and the risk finite).

Proof. Let $W_\theta(f)$ be the strictly convex loss function; then

$\alpha W_\theta(f) + (1 - \alpha) W_\theta(f') > W_\theta(\alpha f + (1 - \alpha) f')$

if $\alpha \in \,]0,1[$ and $f \neq f'$. Suppose $f(t)$ and $f'(t)$ are minimum risk unbiased
estimates of $g(\theta)$ with

$h(\theta) = E_\theta(W_\theta(f)) = E_\theta(W_\theta(f'))$.

It follows that $\alpha f + (1 - \alpha) f'$ is an unbiased estimate of $g(\theta)$. Since $f$ and $f'$
are minimum risk estimates, then for $\alpha \in \,]0,1[$

$E_\theta\{W_\theta(\alpha f + (1 - \alpha) f')\} \geq h(\theta)$;

but

$E_\theta\{W_\theta(\alpha f + (1 - \alpha) f')\} \leq E_\theta\{\alpha W_\theta(f) + (1 - \alpha) W_\theta(f')\}$

with inequality strict unless

$W_\theta(\alpha f + (1 - \alpha) f') = \alpha W_\theta(f) + (1 - \alpha) W_\theta(f')$

almost everywhere, which implies $f = f'$ almost everywhere. But by combining
the inequalities we see that they are equalities and hence $f = f'$ almost everywhere.
This proves the lemma.


11. Some examples of estimation in non-parametric theory. For a sample
$X_1, \ldots, X_n$ from an unknown distribution assumed to be absolutely continuous,
the order statistics $x_{(1)}, \ldots, x_{(n)}$ form a complete sufficient statistic. However,
the Lehmann-Scheffé theorem on minimum variance and minimum risk unbiased
estimation cannot be applied immediately in most cases.

Consider the estimation of $E_F(X^2)$. The essential step in the Lehmann-Scheffé
theorem is in showing that the estimate which depends on $x_{(1)}, \ldots, x_{(n)}$ is unique.
Let $f_1(x_{(1)}, \ldots, x_{(n)})$ and $f_2(x_{(1)}, \ldots, x_{(n)})$ be two such unbiased estimates; then
$f_1 - f_2$ is an unbiased estimate of zero for those distributions for which $E_F(X^2)$
is finite. Both $\mathfrak{F}_1$ and $\mathfrak{F}_2$ consist of such distributions; hence $f_1 = f_2$ almost
everywhere. Thus it is essential to check that the sufficient statistic is complete
for the distributions for which the parameter in question exists.

As a second example we consider the problem of obtaining minimum variance
unbiased estimates of the parameters of the distributions $\mathfrak{F}_0(p)$ $(p \neq 0, 1)$.
We restrict attention to parameters $g(\theta)$ which exist at least for a minimal
class sufficiently large that the class of unbiased estimates of zero contains
only estimates with finite variance and we apply Lemma 10.1. $\nu_0$ contains
statistics $f(x_{(1)}, \ldots, x_{(n)})$ satisfying (2.4).
A statistic $g(x_{(1)}, \ldots, x_{(n)})$ in $\nu_1$ will satisfy

$E\{g(x_{(1)}, \ldots, x_{(n)})\, f(x_{(1)}, \ldots, x_{(n)})\} = 0$

whenever $\mathrm{Var}\, g < \infty$. For those statistics having finite variance for the minimal
class of distributions mentioned above, $f(x_{(1)}, \ldots, x_{(n)})\, g(x_{(1)}, \ldots, x_{(n)})$ will
also satisfy (2.4). Thus for every $f(x_{(1)}, \ldots, x_{(n)})$ satisfying

$\sum_{x = z, y} \left(\frac{q}{p}\right)^{i(x_1, \ldots, x_n)} f(x_1, \ldots, x_n) = 0$,

we have

$\sum_{x = z, y} \left(\frac{q}{p}\right)^{i(x_1, \ldots, x_n)} f(x_1, \ldots, x_n)\, g(x_1, \ldots, x_n) = 0$.

It follows that $g(x_1, \ldots, x_n) = 0$ and hence that there are no nondegenerate
minimum variance unbiased estimates which have finite variance over the
minimal class of distributions.


REFERENCES


1. P. Halmos, The theory of unbiased estimation, Ann. Math. Stat., 17 (1946), 34-43.

2. E. L. Lehmann, Notes on the theory of estimation, Lecture notes mimeographed at the
University of California.

3. D. A. S. Fraser, Completeness of order statistics, Can. J. Math., 6 (1954), 42-45.

4. E. L. Lehmann and C. Stein, Most powerful tests of composite hypotheses. I: Normal distributions,
Ann. Math. Stat., 19 (1948), 495-516.

5. E. L. Lehmann and C. Stein, On the theory of some non-parametric hypotheses, Ann. Math.
Stat., 20 (1949), 28-45.

6. H. Robbins, On distribution-free tolerance limits in random sampling, Ann. Math. Stat.,
15 (1944), 214-217.

7. H. Scheffé and J. W. Tukey, Nonparametric estimation. I: Validation of order statistics,
Ann. Math. Stat., 16 (1945), 187-192.

8. E. L. Lehmann and H. Scheffé, Completeness, similar regions, and unbiased estimation,
Sankhya, 10 (1950), 305-340.

9. F. Wilcoxon, Individual comparisons by ranking methods, Biometrics, 1 (1945), 80-83.


University of Toronto 




























UNITARY TRANSFORMATIONS 
B. E. MITCHELL 


1. Introduction. We consider the problem of finding a unique canonical 
form for complex matrices under unitary transformation, the analogue of the 
Jordan form (1, p. 305, §3), and of determining the transforming unitary matrix 
(1, p. 298, 1. 2). The term "canonical form" appears in the literature with different
meanings. It might mean merely a general pattern as a triangular form
(the Jacobi canonical form (8, p. 64)). Again it might mean a certain matrix 
which can be obtained from a given matrix only by following a specific set of 
instructions (1). More generally, and this is the sense in which we take it, it 
might mean a form that can actually be described, which is independent of the 
method used to obtain it, and with the property that any two matrices in this 
form which are unitarily equivalent are identical. 

Toeplitz settled the question for normal matrices in 1918. Perhaps the first
canonical form for non-normal matrices was given by Röseler (7) in 1933. He
used Frobenius covariants to obtain various triangular forms for special classes
of matrices. Currie (2) gave a triangular form for a general matrix, but his 
work has not yet been published. 

In this paper we give a complete solution to our problem as stated above 
for non-derogatory matrices, and a partial solution for the derogatory case. 
The solution includes a partial solution to the following allied problem: What
conditions on the non-diagonal elements must hold for $T_1$ to be unitarily equivalent
to $T_2$ when $T_1$ and $T_2$ are two triangular matrices with the same diagonal
elements?


2. A canonical form. We begin with some preliminary material.

LEMMA. If $\phi_1, \ldots, \phi_r$ is a set of normalized orthogonal vectors, then there
exists a unitary matrix with $\phi_1, \ldots, \phi_r$ as its first $r$ rows.

THEOREM 1. For any matrix $A$ there exist unitary matrices $U$, $V$ such that
$UA = T_1$ is triangular (with 0's above the main diagonal) and $AV = T_2$ is
triangular (with 0's below the main diagonal).

Suppose $A = (a_{ij})$ and $\phi_1 = [y_1, \ldots, y_n]$. The requirement that $\phi_1 A = [*\ 0 \ldots 0]$
with 0's in the last $n - 1$ places yields a set of $n - 1$ linear homogeneous equations
in the $n$ unknowns $y_1, \ldots, y_n$, which always has a non-trivial solution.
Thus a non-zero $\phi_1$ may be determined and we may suppose it normalized.
If $U_1$ is a unitary matrix with $\phi_1$ as its first row then $U_1 A$ has 0's above the main
diagonal in the first row. By induction the proof for $UA$ is complete.


Received August 21, 1952; in revised form August 4, 1953. The writer appreciates the 
referee’s suggestions for improving the exposition. 















By working on the other side we may show similarly that $AV = T_2$.


COROLLARY 1 (Schmidt). If $P$ is non-singular there exists a (non-singular)
triangular matrix $T$ with 0's above the main diagonal such that $PT$ is unitary.
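
Corollary 1 is the Gram-Schmidt process in matrix form. A minimal sketch follows, assuming the columns of $P$ are orthonormalized from last to first so that each column of the result combines only that column and later columns of $P$, which makes the implied $T$ lower triangular. The function names and the example matrix are hypothetical, and the classical (unstabilized) Gram-Schmidt variant is used for brevity.

```python
def adjoint_times(Q, M):
    """Return Q* M, where Q* is the conjugate transpose of Q (lists of rows)."""
    n = len(Q)
    return [[sum(Q[r][i].conjugate() * M[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


def schmidt_unitary(P):
    """Given non-singular P, return unitary Q = P T with T lower triangular."""
    n = len(P)
    cols = [[P[r][c] for r in range(n)] for c in range(n)]
    q = [None] * n
    for j in range(n - 1, -1, -1):  # last column first
        v = list(cols[j])
        for k in range(j + 1, n):   # remove components along later columns
            coeff = sum(a.conjugate() * b for a, b in zip(q[k], cols[j]))
            v = [vi - coeff * qi for vi, qi in zip(v, q[k])]
        norm = sum(abs(vi) ** 2 for vi in v) ** 0.5
        q[j] = [vi / norm for vi in v]
    return [[q[c][r] for c in range(n)] for r in range(n)]  # back to rows


P = [[1 + 1j, 2 + 0j, 0j], [0j, 1j, 1 + 0j], [1 + 0j, 0j, 3 + 0j]]
Q = schmidt_unitary(P)
gram = adjoint_times(Q, Q)   # should be the identity matrix
lower = adjoint_times(Q, P)  # equals the inverse of T: zeros above the diagonal
```

Checking that `lower` has zeros above the main diagonal confirms that $T = P^{-1}Q$ is triangular in the sense of the corollary, since the inverse of a lower triangular matrix is lower triangular.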


COROLLARY 2. Any set of matrices which may be simultaneously triangularized 
by similarity transformation may be simultaneously triangularized by unitary 
transformation. 


Let $P$ be a matrix which reduces $A$ to Jordan normal form (3, chap. 6),
$P^{-1}AP = C = C_1 + \ldots + C_k$, where the $C_i$ are the non-derogatory blocks
in the Jordan form. Let $T$ be a triangular matrix with 0's above the main
diagonal such that $PT = U$ is unitary. Then $U^*AU = T^{-1}CT$ is triangular.
Moreover if $T$ is partitioned in accordance with $C$ so that the diagonal blocks
are $T_1, \ldots, T_k$ then the $i$th diagonal block of $T^{-1}CT$ is $T_i^{-1} C_i T_i$ and hence is
similar to $C_i$.


THEOREM 2. Any matrix may be unitarily transformed to triangular form
with diagonal blocks $A_1, \ldots, A_k$ which are respectively similar to the diagonal
blocks $C_1, \ldots, C_k$ in the Jordan form.


This theorem might have been obtained from consideration of linear trans- 
formations (5). It has been given in terms of matrices since the uniqueness 
proof is in the latter form. 

Suppose $A$ is non-derogatory and $U^*AU = B$ has this form for a unitary
$U$, and $C = C_1 + \ldots + C_k$ is the Jordan form of $A$. Then $B$ is similar to $C$,
say $T^{-1}BT = C$. Partition $B$ and $T$ in accordance with $C$ so that $B = (B_{ij})$,
$T = (T_{ij})$; $i, j = 1, 2, \ldots, k$.

Consider the elements above the main diagonal in $BT$ and $TC$. Comparison
of the elements in the first row and second column gives $B_{11}T_{12} = T_{12}C_2$. As
$A$ is non-derogatory, $B_{11}$ and $C_2$ have no characteristic root in common and
hence $T_{12} = 0$ (4, p. 90). Similarly $T_{13}, \ldots, T_{1k}$ are 0. Following this procedure
with the remaining rows shows that $T$ is a triangular block matrix. In particular,
then,

$B_{ii}T_{ii} = T_{ii}C_i$ $(i = 1, 2, \ldots, k)$.

But element-wise comparison of the elements above the main diagonal shows
that $T_{ii}$ is triangular for all $i$. Hence $T$ itself is actually a triangular matrix.
Since $UT = P$ is a matrix which reduces $A$ to Jordan form, we see that, for a
non-derogatory matrix $A$, any unitary matrix which transforms $A$ to the form
given in Theorem 2 is obtained from a matrix which reduces $A$ to Jordan form
by multiplying it on the right by a triangular matrix.

Let us determine then the degree of uniqueness of a matrix which reduces
$A$ to Jordan form. If both $P$ and $Q$ reduce a non-derogatory matrix $A$ to Jordan
form $C$, then $A = PCP^{-1} = QCQ^{-1}$ and so $CP^{-1}Q = P^{-1}QC$. Hence we consider
the equation $CX = XC$. Partition $X$ according to $C$ so that

$X = (X_{ij})$ $(i, j = 1, 2, \ldots, k)$.













Comparison of the elements off the main diagonal shows, as before, that $X_{ij} = 0$
for $i \neq j$. Comparison of elements on the main diagonal gives $X_{ii}C_i = C_iX_{ii}$
and hence that $X_{ii}$ is triangular with order equal to that of $C_i$. Thus

$P^{-1}Q = R = R_{11} + \ldots + R_{kk}$

where $R_{ii}$ is triangular with order equal to that of $C_i$, and $Q = PR$.

We now determine the degree of uniqueness of the transforming unitary
matrix. Suppose $U$ and $V$ are two unitary matrices which transform a non-derogatory
matrix $A$ to the form of Theorem 2. Then there exist triangular
matrices $T_1$ and $T_2$ such that $UT_1 = P$, $VT_2 = Q$, where $P$ and $Q$ both reduce
$A$ to Jordan form. Hence $Q = PR$, or $VT_2 = UT_1R$. Thus $U^*V = T_1RT_2^{-1}$.
Now $U^*V$ is triangular since it is the product of triangular matrices and hence,
since it is also unitary, it is diagonal. Thus $V = UD$. That is, the unitary
matrix which transforms a non-derogatory matrix to the form of Theorem 2
is unique up to multiplication on the right by a diagonal unitary matrix. The
absolute value of every element of a matrix in this form is therefore invariant.
Let us agree to go from left to right down the successive diagonals below the
main diagonal and pick out each non-zero element as we come to it until we
obtain either a total of $n - 1$ non-zero elements or all non-zero elements off
the main diagonal, where $n$ is the order of the matrix. These chosen non-zero
elements can then be made positive by transforming by a diagonal unitary
matrix. We thus obtain a canonical form that is invariant under transformation
by a general unitary matrix.


THEOREM 3. The form of Theorem 2 is unique for a non-derogatory matrix 
(for a specified ordering of the roots and a convention as to which non-diagonal 
elements will be made non-negative). 


Consideration of the Jordan normal form of a matrix $A$ shows that it is non-derogatory
if and only if $A - \lambda I$ has nullity 1 for every characteristic root $\lambda$;
that is, there is precisely one characteristic vector for each characteristic root
(6, p. 45). Hence a triangular matrix with but one distinct characteristic root
is non-derogatory if and only if the elements in the diagonal below the main
diagonal are non-zero. These elements can all be made positive on transformation
by a diagonal unitary matrix. Hence we could have required that the elements
in the diagonal below the main diagonal of each of the diagonal blocks of Theorem
2 be positive.


REFERENCES 


1. J. L. Brenner, The problem of unitary equivalence, Acta Math., 86 (1951), 297-308.

2. J. C. Currie, Unitary-canonical matrices, Abstract no. 264, Bull. Amer. Math. Soc., 56 (1950), 321.

3. C. C. MacDuffee, Vectors and matrices, Carus Mathematical Monograph, no. 7 (Math. Assoc. Amer., 1943).

4. C. C. MacDuffee, Theory of matrices, Ergebnisse der Math., vol. 2 (Berlin, 1933).

5. B. E. Mitchell, A canonical form for non-derogatory matrices under unitary transformation, Abstract no. 51, Bull. Amer. Math. Soc., 58 (1952), 55.

6. W. V. Parker, The matrix equation AX = XB, Duke Math. J., 17 (1950), 43-51.

7. H. Röseler, Normalformen von Matrizen gegenüber unitären Transformationen, Dissertation (Leipzig, 1933).

8. H. W. Turnbull and A. C. Aitken, An introduction to the theory of canonical matrices (Glasgow, 1948).


Louisiana State University 




















ON LINEAR PARTIAL DIFFERENTIAL EQUATIONS OF 
THE SECOND ORDER HAVING GEODESIC 
SOLUTIONS 


G. F. D. DUFF 


Introduction. The coefficients of the second derivatives in an elliptic or 
hyperbolic differential equation of the second order determine a Riemannian 
metric on the space of the independent variables. A Riemannian space has been 
called harmonic if the Laplace equation corresponding to it has a solution 
which is a function only of geodesic distance in that space. Harmonic spaces 
have been studied in some detail (1; 3). In this note are examined the circum- 
stances under which a non-parabolic second order linear equation may have a 
“geodesic” solution of the type described. It will be shown that the equation 
must be self-adjoint, that the Riemann space corresponding to the equation 
must be harmonic and that the coefficient of the dependent variable must be a 
constant. Conversely, if these conditions are satisfied, the equation has two 
geodesic solutions, one of which is an elementary solution. In the case of elliptic 
equations, the second solution is connected with a certain mean value property 
which is valid in harmonic spaces. 


1. Riemannian metric. Any homogeneous linear partial differential 
equation of the second order, in N independent variables, and which is not 
parabolic, can be written in the form 


(1.1)    $L[u] = \Delta u + b \cdot \nabla u + cu = 0.$

Here

(1.2)    $(\nabla u)_i = \dfrac{\partial u}{\partial x^i}$

is the gradient vector of the dependent variable $u$. The Laplacian operator

(1.3)    $\Delta u = \dfrac{1}{\sqrt{a}}\, \dfrac{\partial}{\partial x^i} \left( \sqrt{a}\, a^{ik}\, \dfrac{\partial u}{\partial x^k} \right)$

is that based on the metric form

(1.4)    $ds^2 = a_{ik}\, dx^i\, dx^k,$

the $a_{ik}$ being defined in terms of the coefficients $a^{ik}$ in (1.1) by

(1.5)    $a_{ij}\, a^{jk} = \delta_i^k.$

The absolute value of the determinant of the $a_{ik}$ will be denoted by the letter
$a$. In (1.1) also, $b$ is a contravariant vector and $c$ a scalar invariant, both given




Received December 18, 1952; in revised form February 27, 1953 




in advance. The differential equation is self-adjoint, in the invariantive sense, 
if and only if the vector b is zero identically. 

The equation (1.1) is elliptic if and only if the metric (1.4) is (positive)
definite. Similarly (1.1) is normal hyperbolic if and only if the signature of
(1.4) is $\pm(N - 2)$. While we have in mind mainly these two cases, we need
not for the moment restrict the signature. It will be supposed that all coefficients
appearing in the differential operator $L$ are four times continuously differentiable
in the given coordinate system.

If both $b$ and $c$ are zero, the equation is Laplace's equation $\Delta u = 0$. It is
well known that in a flat space (constant coefficients $a^{ik}$), Laplace's equation
has an elementary solution of the form $\log r$ $(N = 2)$, $r^{2-N}$ $(N > 2)$, where $r$
is the distance function in the space. Thus a flat space is harmonic. A Riemannian
space is said to be completely harmonic if the base point from which the
geodesic distance $s = s(P,Q)$ is measured can be any point of the space. We shall
assume that the base point is arbitrary, in our theorem. Completely harmonic
spaces form a rather special class; for instance, they are all Einstein spaces (1).

We remark that Thomas and Titt (5) have investigated the conditions under 
which an equation (1.1) has a solution which is a power of the geodesic distance. 
In particular they have shown that Laplace’s equation in a space of definite 
signature has such a solution only if the space is flat and N > 2. 


2. A coordinate system. For convenience we shall use the squared geodesic
distance

(2.1)    $\Gamma = \Gamma(P,Q) = s^2(P,Q),$

which is always real. It is well known (2, p. 433) that

(2.2)    $(\nabla\Gamma)^2 \equiv a^{ik}\, \dfrac{\partial\Gamma}{\partial x^i}\, \dfrac{\partial\Gamma}{\partial x^k} = 4\Gamma,$

and that, for any function $F(\Gamma)$ of $\Gamma$ alone,

(2.3)    $\Delta F(\Gamma) = F'(\Gamma)\,\Delta\Gamma + 4\Gamma F''(\Gamma).$

(Here derivatives of $F(\Gamma)$ with respect to $\Gamma$ are indicated by dashes.) It follows
from (2.3) that a space $V_N$ is harmonic if and only if $\Delta\Gamma$ is expressible as a
function of $\Gamma$ alone (1); thus, $V_N$ is harmonic if and only if

(2.4)    $\Delta\Gamma = f(\Gamma).$

Let $x^i = \sigma p^i$ be Riemannian coordinates with pole at $Q$, an arbitrary but
fixed point (5). Here $\sigma$ is a normalized parameter on the geodesics issuing from
$Q$; $\sigma$ takes the values zero at $Q$ and unity on a suitable surface $S$ enclosing $Q$.
It is convenient to choose this surface $S$ in such a way that any geodesic line
through $Q$ meets $S$ in two points which are at the same geodesic distance from
$Q$. This distance is not necessarily the same for all rays through $Q$. If the metric
is positive definite, however, we may take $\sigma$ proportional to $s$, so that $S$ is a
geodesic sphere with centre $Q$. The components $a^{ik}$ can now be assumed twice
differentiable in this coordinate system.







Now we have

(2.5)    $\Gamma(P,Q) = a_{ik}(Q)\, x^i x^k = \sigma^2 a_{ik}(Q)\, p^i p^k.$

The gradient vector $\nabla\Gamma$ has components

(2.6)    $(\nabla\Gamma)_i = 2\sigma\, a_{ik}(P)\, p^k = 2\sigma\, a_{ik}(Q)\, p^k,$

in this Riemannian coordinate system (4, p. 87). Let $a$ be the modulus of the
determinant $|a_{ik}|$ in Riemannian coordinates; then in view of (1.3)

(2.7)    $\Delta\Gamma = \Delta_P \Gamma(P,Q) = 2N + \sigma\, \dfrac{\partial \log a}{\partial \sigma}.$

The second term on the right of (2.7) is $O(\sigma^2)$ as $\sigma \to 0$.


3. Geodesic solutions. If the coefficient $c$ is zero, the equation (1.1)
has the trivial solution $u$ = const., which solution we shall exclude in the
hypotheses of our theorem. In order to allow such solutions as are singular
for $\Gamma = 0$, we shall suppose only that the solution function is twice continuously
differentiable for $\Gamma \neq 0$ (and is defined for $\Gamma$ sufficiently small). We write
$\Gamma = \Gamma(P,Q)$, $Q$ being arbitrary.


THEOREM I. There exists a solution $u = F(\Gamma)$ of (1.1), of class $C^2$ for $\Gamma \neq 0$
and which is not a constant in any neighbourhood of $\Gamma = 0$, if and only if

(a) the coefficient $c$ is a constant;

(b) the vector $b$ vanishes identically;

(c) the Riemann space $V_N$ is completely harmonic.

Conversely, if (a), (b), (c) hold, there exist two independent solutions of the
form $u = F(\Gamma)$, one of which is an elementary solution, the other being regular
at the origin $\Gamma = 0$.


Suppose, to prove the first part of the result, that a solution $u = F(\Gamma)$ exists.
Then from (1.1) and (2.3) we have

(3.1)    $LF(\Gamma) = 4\Gamma F''(\Gamma) + F'(\Gamma)\{\Delta\Gamma + b \cdot \nabla\Gamma\} + cF(\Gamma) = 0.$

Now from (2.6)

(3.2)    $b \cdot \nabla\Gamma = 2a_{ik}(Q)\, x^i b^k = 2\sigma\, a_{ik}\, p^i b^k = 2\sigma b_\sigma,$

say, where $b_\sigma$ is thus defined. We may assume that $F(\Gamma)$ is not identically
zero for $\Gamma$ small, or else the solution $u$ would be zero, and so we may divide
(3.1) by $F(\Gamma)$ and let $\Gamma$ tend to zero. In view of (2.7) and (3.2), we see that

(3.3)    $c = c(Q) = -\lim_{\Gamma \to 0} \dfrac{4F''(\Gamma)\,\Gamma + 2N F'(\Gamma)}{F(\Gamma)}.$

The limit on the right exists, because, by hypothesis, $F(\Gamma)$ is a solution function
and is of class $C^2$ for $\Gamma \neq 0$; and also $c(P)$ is continuous and tends to the limit
$c(Q)$. However, the right-hand side of (3.3) depends only on the function $F(\Gamma)$
and not at all on $Q$. Thus $c(Q)$ is independent of $Q$ and so is a constant. This
shows that condition (a) is necessary.

By hypothesis, $F'(\Gamma)$ is different from zero for a sequence of values of $\Gamma$
tending to zero. For such values of $\Gamma$, we may divide (3.1) by $F'(\Gamma)$. Noting













that $c$ is a constant, we see that the quantity

(3.4)    $\Delta\Gamma + b \cdot \nabla\Gamma = 2N + 2\sigma b_\sigma + O(\sigma^2)$

then depends only on $\Gamma$.
On any non-null geodesic through $Q$, let $P_1$ and $P_2$ be points lying in the
order $P_1 Q P_2$ on the geodesic, and such that $\sigma(P_1) = \sigma(P_2)$. It follows that

$x^i_{(1)} = -x^i_{(2)}, \qquad p^i_{(1)} = -p^i_{(2)}.$

Also, from (2.5) we see that $\Gamma(P_1,Q) = \Gamma(P_2,Q)$. Now let (3.4) be written down
for $P = P_1$, and $P = P_2$; and let the two resulting equations be subtracted.
Thus

(3.5)    $2\sigma b_\sigma(P_1) = 2\sigma b_\sigma(P_2) + O(\sigma^2),$

for any value of $\Gamma$ in the above-mentioned sequence. Dividing (3.5) by $2\sigma$
we have

$b_\sigma(P_1) = b_\sigma(P_2) + O(\sigma),$

and the terms appearing in this relation are continuous near $\sigma = 0$. Thus
from (3.2),

(3.6)    $a_{ik}(Q)\, p^i_{(1)} b^k(P_1) = a_{ik}(Q)\, p^i_{(2)} b^k(P_2) + O(\sigma) = -a_{ik}(Q)\, p^i_{(1)} b^k(P_2) + O(\sigma).$

Now let $\sigma \to 0$, so that $P_1, P_2 \to Q$. Then the functions $b^k(P_1)$ and $b^k(P_2)$,
being continuous, tend to their limits $b^k(Q)$. Thus

(3.7)    $a_{ik}(Q)\, p^i_{(1)} b^k(Q) = -a_{ik}(Q)\, p^i_{(1)} b^k(Q) = 0$

holds for any non-null direction $p^i_{(1)}$ at $Q$. Since we can find $N$ independent
non-null directions, and since $|a_{ik}(Q)|$ is not zero, we conclude that $b^k(Q) = 0$
$(k = 1, \ldots, N)$. Since $Q$ is an arbitrary point, the vector $b$ must vanish
identically. Thus condition (b) is necessary.

From (3.1), in which the term $b \cdot \nabla\Gamma$ is now absent, we see that $\Delta\Gamma$ is defined
as a function of $\Gamma$ provided that $F'(\Gamma) \neq 0$. If $F'(\Gamma)$ does not vanish throughout
any interval, $\Delta\Gamma$ is defined by continuity. On the other hand, if $F'(\Gamma)$ is zero
throughout any interval it follows that the product $cF(\Gamma)$ must be zero. We
show that this possibility is excluded by our hypotheses. If $c$ is zero, then

$4\Gamma F''(\Gamma) + \Delta\Gamma\, F'(\Gamma) = 0,$

and $F'(\Gamma)$ has a zero. Thus $F'(\Gamma)$ must be identically zero since it is also a
solution of this homogeneous differential relation. If $c$ is not zero then

$4\Gamma F''(\Gamma) + \Delta\Gamma\, F'(\Gamma) + cF(\Gamma) = 0,$

and $F(\Gamma)$, $F'(\Gamma)$ are simultaneously zero. Again, it follows that $F(\Gamma)$ must
vanish identically. According to our hypothesis, therefore, $F'(\Gamma)$ is not zero
throughout any interval. Hence $\Delta\Gamma$ is defined as a function of $\Gamma$ for all
(sufficiently small) values of $\Gamma$, and for any base point $Q$. That is, $V_N$ is completely
harmonic, which is condition (c). This establishes the necessity of the three
conditions.













Turning to the converse statement, consider the equation

(3.8)    $\Delta u + cu = 0,$

in a completely harmonic space $V_N$, where $c$ is a constant. From (2.3) and (2.4)
we see that $u = F(\Gamma)$ is a solution if and only if

(3.9)    $4\Gamma F''(\Gamma) + f(\Gamma) F'(\Gamma) + cF(\Gamma) = 0.$

Setting $F = y$, $\Gamma = x$, we see that (3.9) is the ordinary differential equation

(3.10)    $4xy'' + f(x)\, y' + cy = 0.$

We shall refer to (3.10) as the fundamental equation. If the conditions of the
theorem are met, the fundamental equation is defined, and any solution of it
yields a solution $u = F(\Gamma)$ of the partial differential equation. The fundamental
equation has two linearly independent solutions.


4. Nature of the solutions. The origin is a regular singular point of the
fundamental equation, since from (2.4) and (2.7) we see that

(4.1)    $f(x) = 2N + O(x), \qquad x \to 0.$

Supposing that the solutions can be developed in the form

(4.2)    $y = x^p\, y_1(x)$

where $y_1(0) \neq 0$ and $y_1(x)$ can be expressed as a Taylor series with remainder
about the origin, we find the indicial equation for $p$ to be

(4.3)    $4p(p - 1) + 2Np = 0.$

The roots are $p = 0$, $p = -\tfrac{1}{2}N + 1$. Corresponding to $p = 0$ is a solution
finite and continuous at the origin. The root $p = -\tfrac{1}{2}N + 1$ leads to a solution
which is singular at the origin of the order indicated. If the function $f(x)$ is
analytic, and $N$ is odd, the singular solution has the form of a power series,
multiplied by $x^{-\frac{1}{2}N+1}$. If $N$ is even, $N \geq 4$, the roots of the indicial equation
differ by an integer, and the solution which is singular at the origin will in
general contain a logarithmic term. If $N = 2$, the roots are equal, and the
singularity is that of $\log x$. In all these cases the solution which is singular at the
origin leads to an elementary solution of the partial differential equation. This
establishes the converse part of Theorem 1.

If $f(x) = 2N$, the fundamental equation is a form of Bessel's equation of
order $\pm\tfrac{1}{2}(N - 2)$ (2, p. 227). This corresponds to Riemann spaces of the type
known as simply harmonic. A simply harmonic space of elliptic or normal
hyperbolic type is necessarily flat (3; 5). It follows that the only elliptic or
normal hyperbolic equations with elementary solutions of the Bessel function
type given in (2) are the classical equations $\Delta u + cu = 0$ with constant
coefficients, constant, that is, when Cartesian coordinates are used. In particular,
if the power $\Gamma^p$ $(p = -\tfrac{1}{2}N + 1)$ is to be a solution, we must have $c = 0$; i.e.
the equation is $\Delta u = 0$ in a flat space (5).

Returning to the general case, we see that if c = 0 the fundamental equation 
can be integrated explicitly. The solutions are y = const. and 













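(4.4)    $y = C \displaystyle\int_a^x t^{-\frac{1}{2}N} \exp\left[ -\tfrac{1}{4} \int_b^t \bigl(f(\tau) - 2N\bigr)\, \dfrac{d\tau}{\tau} \right] dt,$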


where C, a, b are constants. The solution finite at the origin is in this case a 
constant, while the other is the Ruse elementary solution (1, p. 118). 


5. A mean value theorem. Consider the (elliptic) equation

(5.1)    $\Delta u + cu = 0$

in a completely harmonic space of positive definite metric, and where $c$ is a
constant. Let $Q$ be an arbitrary point, let $x^i$ be Riemannian coordinates with
pole at $Q$ such that

(5.2)    $a_{ii} = 1, \qquad a_{ik} = 0 \quad (i \neq k).$

Then

(5.3)    $\Gamma = s^2(P,Q) = \displaystyle\sum_{i=1}^{N} (x^i)^2.$

Let $\theta^\alpha$ $(\alpha = 1, \ldots, N-1)$ be angular coordinates, forming, together with $s$,
a system of geodesic polar coordinates with origin at $Q$.
The volume element in Riemannian coordinates is

(5.4)    $dV = a^{1/2}\, dx^1 \cdots dx^N,$

where $a = |a_{ik}|$ has its previous meaning. Now

(5.5)    $dV = ds\, dS,$

where $dS$ is the surface element on the geodesic sphere of radius $s$ about $Q$.
Let $d\Omega$ denote the element of solid angle in terms of the angle variables $\theta^\alpha$
(which may be defined from the $x^i$ exactly as they are in Euclidean space).
Thus

(5.6)    $dx^1 \cdots dx^N = s^{N-1}\, ds\, d\Omega,$

so that

(5.7)    $dS = s^{N-1} a^{1/2}\, d\Omega.$

We shall use $\omega_N$ to denote the total solid angle at a point. In view of (2.4) and
(2.7), we have

(5.8)    $a = a(P) = a(Q) \exp\left[ \displaystyle\int_0^s \bigl(f(t^2) - 2N\bigr)\, \dfrac{dt}{t} \right],$

so that $a$ is a function of $s$ alone, and not of the angle variables.
Let $v(Q,s) = v(s)$ denote the mean value

(5.9)    $v(s) = \dfrac{1}{\omega_N} \displaystyle\int_\Omega u(s, \theta^\alpha)\, d\Omega,$

of a function $u(P)$ over the geodesic sphere of radius $s$ about $Q$.

THEOREM II. If $u(P)$ is a solution of (5.1) in an elliptic completely harmonic
Riemann space, then $v(s)$, defined by (5.9), is that solution of the fundamental
equation which is equal to one at the origin.




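To prove this we have (2, p. 411)

(5.10)    $\dfrac{dv}{ds} = \dfrac{1}{\omega_N} \displaystyle\int_\Omega \dfrac{\partial u}{\partial s}\, d\Omega = \dfrac{1}{\omega_N} \int_\Omega \dfrac{\partial u}{\partial n}\, d\Omega,$

since

$\dfrac{\partial u}{\partial n} = \nabla u \cdot n = \nabla u \cdot \nabla s = \dfrac{\partial u}{\partial s}.$

From (5.7),

(5.11)    $\dfrac{dv}{ds} = \dfrac{1}{\omega_N\, a^{1/2} s^{N-1}} \displaystyle\int_S \dfrac{\partial u}{\partial n}\, dS,$

where $S$ is the surface of the geodesic sphere. By Green's formula and (5.1):

(5.12)    $\dfrac{dv}{ds} = \dfrac{1}{\omega_N\, a^{1/2} s^{N-1}} \displaystyle\int_K \Delta u\, dV = \dfrac{-c}{\omega_N\, a^{1/2} s^{N-1}} \int_K u\, dV,$

where $K$ denotes the interior of the sphere. We note that if $c = 0$ the constancy
of $v(s)$ follows. Otherwise, we have from (5.5) and (5.12)

(5.13)    $\dfrac{d^2 v}{ds^2} = \dfrac{c}{\omega_N\, a^{1/2} s^{N-1}}\, \dfrac{\partial}{\partial s}\bigl\{\log\bigl(a^{1/2} s^{N-1}\bigr)\bigr\} \displaystyle\int_K u\, dV - \dfrac{c}{\omega_N\, a^{1/2} s^{N-1}} \int_S u\, dS.$

In view of (5.9) and (5.12), (5.13) becomes

(5.14)    $\dfrac{d^2 v}{ds^2} + \left( \dfrac{N-1}{s} + \dfrac{1}{2} \dfrac{\partial \log a}{\partial s} \right) \dfrac{dv}{ds} + cv = 0.$

From (2.4) and (2.7) with $\sigma = s$ in this elliptic case, it follows that if we set
$x = \Gamma = s^2$, we find

(5.15)    $4xv'' + f(x)\, v' + cv = 0,$

which is just the fundamental equation. Clearly $v(0)$ has the value unity. This
completes the proof of Theorem II.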


This mean value property is well known in Euclidean space; we see that it 
holds whenever the partial differential equation has a geodesic solution. In parti- 
cular, the mean value over a geodesic sphere of a harmonic function in a 
completely harmonic space is equal to its value at the centre (5, Theorem 1). 
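The Euclidean case can be checked by a short computation. The sketch below assumes the flat case $f(x) = 2N$ and expands the solution of the fundamental equation that equals one at the origin as a power series; the recurrence in the comment is obtained by substituting the series into $4xy'' + 2Ny' + cy = 0$ and is not written out in the paper, and the function names are illustrative. For $N = 3$, $c = 1$ the series should reproduce the familiar Euclidean mean value $\sin s / s$ with $x = s^2$.

```python
from fractions import Fraction
import math

def regular_solution_coeffs(N, c, terms):
    """Taylor coefficients a_0, ..., a_{terms-1} of the solution of
    4*x*y'' + 2*N*y' + c*y = 0 with y(0) = 1 (flat case f(x) = 2N)."""
    a = [Fraction(1)]
    for k in range(terms - 1):
        # coefficient of x**k: (k + 1)*(4*k + 2*N)*a[k+1] + c*a[k] = 0
        a.append(-Fraction(c) * a[k] / ((k + 1) * (4 * k + 2 * N)))
    return a

def evaluate(coeffs, x):
    """Evaluate the truncated series at x (floating point)."""
    return sum(float(a) * x ** k for k, a in enumerate(coeffs))

coeffs = regular_solution_coeffs(3, 1, 12)
s = 1.3
print(evaluate(coeffs, s * s), math.sin(s) / s)  # the two values should agree
```

With $c = 0$ every coefficient after the first vanishes, so the series reduces to the constant 1, in line with the statement that the mean value of a harmonic function equals its value at the centre.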


REFERENCES 


1. E. T. Copson and H. S. Ruse, Harmonic Riemannian spaces, Proc. Roy. Soc. Edinburgh, 60 (1940), 117-133.

2. R. Courant and D. Hilbert, Methoden der mathematischen Physik, vol. 2 (Berlin, 1937).

3. A. Lichnerowicz and A. G. Walker, C.R. Acad. Sci., Paris, 221 (1945), 394-396.

4. T. Y. Thomas, Differential invariants of generalized spaces (Cambridge, 1934).

5. T. Y. Thomas and E. W. Titt, On the elementary solution of the linear second order equation, J. Math. pures appl. (9), 17 (1939), 217-248.

6. T. J. Willmore, Mean value theorems in harmonic Riemannian spaces, J. London Math. Soc., 25 (1950), 54-57.


University of Toronto 











A CONTRIBUTION TO THE THEORY OF CHROMATIC 
POLYNOMIALS 


W. T. TUTTE 


SUMMARY 

Two polynomials $\theta(G, n)$ and $\phi(G, n)$ connected with the colourings of a graph $G$
or of associated maps are discussed. A result believed to be new is proved for the
lesser-known polynomial $\phi(G, n)$. Attention is called to some unsolved problems
concerning $\phi(G, n)$ which are natural generalizations of the Four Colour Problem from
planar graphs to general graphs. A polynomial $\chi(G, x, y)$ in two variables $x$ and $y$,
which can be regarded as generalizing both $\theta(G, n)$ and $\phi(G, n)$, is studied. For a
connected graph $\chi(G, x, y)$ is defined in terms of the "spanning" trees of $G$ (which include
every vertex) and in terms of a fixed enumeration of the edges. The invariance of
$\chi(G, x, y)$ under a change of this enumeration is apparently a new result about spanning
trees. It is observed that the theory of spanning trees now links the theory of
graph-colourings to that of electrical networks.


1. Introduction. A graph G consists of a set V(G) of elements called 
vertices together with a set E(G) of elements called edges, the two sets having 
no common element. With each edge there are associated either one or two 
vertices called its ends. 

An edge of G is a loop or link according as the number of its ends is 1 or 2. 
For convenience we sometimes say that a link has two distinct ends and a loop 
two equal ends. 

We restrict ourselves to finite graphs, that is graphs for which V(G) and 
E(G) are both finite. 

If V(G) = 0 we must have E(G) = 0 also. 

A graph $H$ is a subgraph of $G$ if $V(H) \subseteq V(G)$, $E(H) \subseteq E(G)$ and each edge
of $H$ has the same ends in $H$ as in $G$. The subgraph $H$ of $G$ is a spanning
subgraph of $G$ if $V(H) = V(G)$. The subgraph of $G$ for which $V(H)$ is a given
subset $W$ of $V(G)$ and $E(H)$ is the set of all edges of $G$ having no end outside
$W$, will be denoted by $G[W]$.

A sequence $(a_0, A_1, a_1, A_2, a_2, \ldots, A_n, a_n)$, in which the terms are alternately
vertices $a_i$ and edges $A_j$ of $G$, is a path from $a_0$ to $a_n$ in $G$ if it satisfies the following
conditions.

(i) If $1 \leq i \leq n$ the ends of $A_i$ are $a_{i-1}$ and $a_i$.

(ii) If $1 \leq i \leq n$ then $a_{i-1} = a_i$ if and only if $A_i$ is a loop.

It is not required that the terms of the sequence shall be all distinct. If they
are distinct the path is simple. If the sequence has more than one term and its
terms are distinct except that $a_0 = a_n$ then the path is circular.

If $x$ and $y$ are elements of $V(G)$ we say $x$ and $y$ are connected in $G$ if there is a
path from $x$ to $y$ in $G$. The relation of connection in $G$ is clearly an equivalence


Received October 1, 1952. 




relation. Hence if $V(G)$ is non-null it can be partitioned into disjoint non-null
subsets $V_1, \ldots, V_k$ such that two vertices of $G$ are connected in $G$ if and only
if they belong to the same set $V_i$. The subgraphs $G[V_i]$ of $G$ are the components
of $G$. Together they include all the edges and vertices of $G$, and no two of them
have an edge or vertex in common. We denote the number of components of
$G$ by $p_0(G)$. The graph $G$ is connected if $p_0(G) = 0$ or 1. The first case arises
only when $V(G) = 0$ and $E(G) = 0$. Clearly each component of a graph is
connected.

A connected graph in which there is no circular path is a tree.

We write $\alpha_0(G)$ and $\alpha_1(G)$ for the numbers of elements of $V(G)$ and $E(G)$
respectively.

Let $Q_n$ be a finite set of $n > 0$ elements. Let $f$ be a mapping of $V(G)$ into
$Q_n$. We call $f$ an $n$-colouring of $G$ if each edge of $G$ has two ends $x$ and $y$ such that
$f(x) \neq f(y)$. We denote the number of $n$-colourings of $G$, defined in terms of
$Q_n$, by $P(G, n)$. If $V(G) = 0$ we take this number to be 1. We observe that
$P(G, n) = 0$ if $G$ has a loop.

$P(G, n)$ is not altered by replacing $Q_n$ by another set of $n$ elements. We find
it convenient to take $Q_n$ as the ring of residue classes mod $n$.

The function $P(G, n)$ was studied by Hassler Whitney (6; 7). He showed
that when $G$ is loopless, $P(G, n)$ is a polynomial in $n$ of degree $\alpha_0(G)$. For planar
graphs $G$ the polynomial has been studied in great detail by Birkhoff and
Lewis (1), who associated it with the dual map of $G$. Following them we call
$P(G, n)$ the chromatic polynomial of $G$.

The following explicit formula for $P(G, n)$ is due to Hassler Whitney.

(1)    $P(G, n) = \displaystyle\sum_{S} (-1)^{\alpha_1(S)}\, n^{p_0(S)}.$


The summation is over all spanning subgraphs S of G. We shall find another 
explicit formula in terms of the spanning trees of G valid when G is connected. 
A spanning tree is a spanning subgraph which is a tree. 
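Formula (1) can be compared with a direct count of $n$-colourings on small graphs. The sketch below makes some assumptions that are not in the paper: vertices are numbered from 0, each edge is a pair of ends (a loop repeats its end), and a spanning subgraph $S$ is identified with its edge set, so that the sum runs over all subsets of $E(G)$. The function names are illustrative.

```python
from itertools import combinations, product

def components(num_vertices, edges):
    """Number of components p_0 of the spanning subgraph with these edges."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(num_vertices)})

def colourings_brute_force(num_vertices, edges, n):
    """Count mappings f of the vertices into n colours with f(x) != f(y) on every edge."""
    return sum(all(f[a] != f[b] for a, b in edges)
               for f in product(range(n), repeat=num_vertices))

def colourings_whitney(num_vertices, edges, n):
    """Evaluate the sum over spanning subgraphs S of (-1)^alpha_1(S) * n^p_0(S)."""
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            total += (-1) ** size * n ** components(num_vertices, subset)
    return total

triangle = [(0, 1), (1, 2), (2, 0)]
print([colourings_whitney(3, triangle, n) for n in range(1, 5)])
print([colourings_brute_force(3, triangle, n) for n in range(1, 5)])
```

Both lists should agree with $n(n-1)(n-2)$ for the triangle. The subset sum has $2^{\alpha_1(G)}$ terms, so this is only practical for very small graphs.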

At this stage it is convenient to apply some of the concepts of elementary
combinatorial topology. We orient $G$ by distinguishing one end of each edge $A$
as the positive end $p(A)$ and one as the negative end $q(A)$. The positive and
negative ends coincide if $A$ is a loop but not if $A$ is a link. If $a \in V(G)$ and
$A \in E(G)$ we write $\eta(A, a) = 0$ if $A$ is a loop or if $a$ is not an end of $A$. Otherwise
we write $\eta(A, a) = 1$ or $-1$ according as $a$ is the positive or the negative
end of $A$. A mapping $f$ of $V(G)$ or $E(G)$ into $Q_n$ is a 0-chain or 1-chain respectively
on $G$ over $Q_n$.

If $V(G)$ is null we consider that there is just one 0-chain on $G$ over $Q_n$. Similarly
if $E(G)$ is null there is just one 1-chain on $G$ over $Q_n$.

If $h$ is a 0-chain on $G$ over $Q_n$ its coboundary $\delta h$ is the 1-chain on $G$ over $Q_n$
satisfying

(2)    $(\delta h)(A) = \displaystyle\sum_{a \in V(G)} \eta(A, a)\, h(a)$

for each $A \in E(G)$. This may be rewritten as













(2a)    $(\delta h)(A) = h(p(A)) - h(q(A)).$


If $g$ is a 1-chain on $G$ over $Q_n$ its boundary $\partial g$ is the 0-chain on $G$ over $Q_n$
satisfying

(3)    $(\partial g)(a) = \displaystyle\sum_{A \in E(G)} \eta(A, a)\, g(A)$

for each $a \in V(G)$. We call $g$ a 1-cycle on $G$ over $Q_n$ if $\partial g = 0$, that is $(\partial g)(a) = 0$
for each $a$.


2. Colour-coboundaries and colour-cycles. A colour-coboundary or
colour-cycle on $G$ over $Q_n$ is a 1-chain $g$ on $G$ over $Q_n$ which is a coboundary or a
1-cycle respectively and which satisfies $g(A) \neq 0$ for each $A \in E(G)$.

We denote the numbers of colour-coboundaries and colour-cycles on $G$ over
$Q_n$ by $\theta(G, n)$ and $\phi(G, n)$ respectively. These numbers are independent of the
orientation of $G$, by (2a) and (3). We consider that $\theta(G, n) = \phi(G, n) = 1$ if
$G$ has no edge.

The colour-coboundaries on $G$ over $Q_n$ are the coboundaries of the $n$-colourings
of $G$, by (2a). Another consequence of (2a) is that $\delta h_1 = \delta h_2$ for 0-chains $h_1$
and $h_2$ on $G$ over $Q_n$ if and only if $h_1(a) - h_2(a)$ is constant in each component
of $G$. Accordingly

(4)    $\theta(G, n) = n^{-p_0(G)}\, P(G, n).$
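Relation (4) can be observed directly on small graphs by listing the coboundaries of all 0-chains. In the sketch below, which is only an illustration, $Q_n$ is taken as the integers mod $n$, vertices are numbered from 0, and each edge is written as the pair (positive end, negative end); the function name is hypothetical.

```python
from itertools import product

def colourings_and_coboundaries(num_vertices, edges, n):
    """Return (P, theta): the number of n-colourings and the number of
    distinct colour-coboundaries over the integers mod n.

    Each edge is a pair (p, q) with p the positive and q the negative end.
    """
    colourings = 0
    coboundaries = set()
    for h in product(range(n), repeat=num_vertices):
        g = tuple((h[p] - h[q]) % n for p, q in edges)   # formula (2a)
        if all(g):                   # g(A) != 0 on every edge
            colourings += 1          # h is then an n-colouring
            coboundaries.add(g)      # colourings differing by constants coincide here
    return colourings, len(coboundaries)

print(colourings_and_coboundaries(3, [(0, 1), (1, 2), (2, 0)], 3))
```

For the triangle with $n = 3$ there are 6 colourings and 2 colour-coboundaries, in agreement with (4) since $p_0(G) = 1$.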


It follows that $\theta(G, n) = 0$ if $G$ has a loop. The function $\phi(G, n)$ need not
vanish if $G$ has a loop. Indeed if $g$ is a 1-chain on $G$ over $Q_n$ and $A$ is a loop of
$G$ then the 0-chain $\partial g$ is independent of $g(A)$, by (3). Hence if $G_0$ is the graph
obtained from $G$ by suppressing the loops, say $l(G)$ in number, we have

(5)    $\phi(G_0, n) = (n - 1)^{-l(G)}\, \phi(G, n).$

However $\phi(G, n)$ does vanish if $G$ has an isthmus. An edge $A$ of $G$ with
ends $x$ and $y$ is called an isthmus of $G$ if each path from $x$ to $y$ has $A$ as a term.
Thus an isthmus is necessarily a link. If $G'_A$ is the graph obtained from $G$ by
suppressing $A$ we may say that $A$ is an isthmus of $G$ provided $x$ and $y$ belong
to different components of $G'_A$. An equivalent definition is that $A$ is an isthmus
of $G$ provided that it is a term of no circular path in $G$. For if $A$ is a term of
such a circular path then $x$ and $y$ are clearly connected in $G'_A$. And if a path
from $x$ to $y$ exists in $G'_A$ the path of this kind with fewest terms is simple and
can be extended to form a circular path in $G$ having $A$ as a term.¹

We observe that a tree may be defined as a connected graph in which each
edge is an isthmus.

The proof that $\phi(G, n)$ vanishes when $G$ has an isthmus $A$ is as follows.
Let $H$ be the component of $G'_A$ having the end $x$ of $A$ in $G$ as a vertex. Let
$g$ be any 1-cycle on $G$ over $Q_n$. Then

$\displaystyle\sum_{B} \eta(B, b)\, g(B) = 0$


¹Our term "isthmus" applies to each of the two kinds of edge for which König uses the
terms "Brücke" and "Endkante" (3, pp. 3, 179).






















for each $b \in V(H)$, where $B$ runs through $E(G)$, by (3). Summing this over all
the vertices of $H$ we obtain $\eta(A, x)g(A) = 0$. Hence $g(A) = 0$. Accordingly
no 1-cycle on $G$ over $Q_n$ is a colour-cycle.

The connection between the function $\phi(G, n)$ and the ordinary theory of
map-colourings is best seen by considering two dual graphs $G$ and $G^*$ on the
sphere. It may be shown, though we do not prove it here, that $\phi(G^*, n) =
\theta(G, n)$. Accordingly each of the following unproved propositions is equivalent
to the famous Four Colour Conjecture.

(i) $\theta(G, 4) > 0$ if $G$ is a planar graph without a loop.

(ii) $\phi(G, 4) > 0$ if $G$ is a planar graph without an isthmus.

I wish to draw attention to some unsolved problems related to (ii) but having
to do with general graphs. They are the problems of proving or disproving the
following conjectures.


CONJECTURE I: There exists a positive integer $m$ such that $\phi(G, n) > 0$
whenever $n \geq m$ and $G$ has no isthmus.


CONJECTURE II: $\phi(G, n) > 0$ whenever $n \geq 5$ and $G$ has no isthmus.


Conjecture II is a stronger version of Conjecture I. We cannot replace 5
by a smaller integer because it can be shown that the Petersen graph (3, p. 194)
satisfies $\phi(G, 4) = 0$.

We prove $\phi(G, 4) = 0$ for the Petersen graph as follows. If $\phi(G, 4) > 0$
then for any orientation of $G$ we can find a colour-cycle $g$ on $G$ over $Q_4$. Let $[m]$
denote the residue class of an integer $m$ modulo 4. If $a$ is any vertex of $G$ the
three residue classes $\eta(A, a)g(A)$ corresponding to the edges $A$ having $a$ as
an end are non-zero and sum to zero. Their values must be either $[1]$, $[1]$, and
$[2]$ or $[-1]$, $[-1]$ and $[2]$. In the first case we call $a$ a positive vertex, in the
second a negative vertex. The edges $A$ such that $g(A) = [1]$ or $[-1]$ are
therefore the edges of some disjoint circular paths no two of which have a common
term and which together involve all the vertices. Each of these paths has an
even number of edges since positive and negative vertices must alternate in it.
It follows that the edges of $G$ can be arranged in three disjoint classes so that
each vertex is an end of one member of each class. But it is well known that
this is not true for the Petersen graph (4).
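The value $\phi(G, 4) = 0$ for the Petersen graph can also be confirmed by machine. The sketch below does not follow the argument of the text. It uses an inclusion-exclusion count instead, under assumptions that are not in the paper: the number of 1-cycles over the integers mod $n$ that vanish outside an edge set $S$ is taken as $n^{|S| - \alpha_0 + p_0(S)}$, and the vertex labelling of the Petersen graph (outer pentagon 0 to 4, inner pentagram 5 to 9) is an arbitrary choice. All names are illustrative.

```python
from collections import Counter
from itertools import combinations

def count_components(num_vertices, edges):
    """Number of components of the spanning subgraph with these edges."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(num_vertices)})

def signed_nullity_counts(num_vertices, edges):
    """Map each nullity k = |S| - alpha_0 + p_0(S) to the sum of (-1)^(|E| - |S|)."""
    m = len(edges)
    counts = Counter()
    for size in range(m + 1):
        for subset in combinations(edges, size):
            nullity = size - num_vertices + count_components(num_vertices, subset)
            counts[nullity] += (-1) ** (m - size)
    return counts

def colour_cycle_count(counts, n):
    """Number of 1-cycles over the integers mod n that are non-zero on every edge."""
    return sum(sign * n ** k for k, sign in counts.items())

PETERSEN = ([(i, (i + 1) % 5) for i in range(5)]             # outer pentagon
            + [(i, i + 5) for i in range(5)]                 # spokes
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])  # inner pentagram

counts = signed_nullity_counts(10, PETERSEN)
print(colour_cycle_count(counts, 4), colour_cycle_count(counts, 5))
```

The first printed value should be 0 and the second positive, which is consistent with the remark that 5 cannot be replaced by a smaller integer in Conjecture II. The sum runs over all $2^{15}$ edge subsets, so the method is a check for small graphs only.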

We may perhaps regard the following theorem as a very short first step 
towards a verification of Conjecture I. 


THEOREM: If $\phi(G, n) > 0$ then $\phi(G, n + 1) > 0$.


Proof. In the preceding combinatorial definitions we may replace $Q_n$ by the
ring of ordinary integers, obtaining integral 0-chains, 1-chains, 1-cycles, etc.

If $\phi(G, n) > 0$ there exists a colour-cycle $g$ on $G$ over $Q_n$. It follows that there
is an integral 1-cycle $g'$ on $G$ such that $g'(A) \in g(A)$ and $|g'(A)| < n$ for each
$A \in E(G)$. This is a consequence of Theorem IV of (5). It is true that that
theorem is stated only for the case in which $G$ is a simplicial 1-complex, that













is a graph without loops and in which no two links have the same two ends,
but its proof is valid with only trivial modifications in the general case. Now
for each $A \in E(G)$ we have $g'(A) \not\equiv 0 \bmod (n + 1)$. Replacing each integer
$g'(A)$ by its residue class mod $(n + 1)$ we obtain a colour-cycle on $G$ over
$Q_{n+1}$. The theorem follows.

The methods now available for the computation of $\theta(G, n)$ and $\phi(G, n)$ are
laborious. They depend on some recursion formulae which we exhibit below.

If $A$ is an edge of $G$ not a loop we define $G''_A$ as the graph obtained from $G$
by suppressing $A$ and then identifying the ends of $A$ in $G$ to form a single
vertex.

By examining the relationships between the colour-coboundaries, and between
the colour-cycles, of the three graphs $G$, $G'_A$ and $G''_A$, where $A$ is any edge of
$G$ not a loop or isthmus, we obtain the identities

(6)    $\theta(G, n) = \theta(G'_A, n) - \theta(G''_A, n),$

(7)    $\phi(G, n) = \phi(G''_A, n) - \phi(G'_A, n).$

For a disconnected graph $G$ with components $G_1, \ldots, G_k$ we evidently have

(8)    $\theta(G, n) = \displaystyle\prod_{i=1}^{k} \theta(G_i, n),$

(9)    $\phi(G, n) = \displaystyle\prod_{i=1}^{k} \phi(G_i, n).$


Lemma: If $J$ is a graph in which every edge is an isthmus then every 1-chain
on $J$ over $Q_n$ is a coboundary on $J$.


Proof. If $\alpha_1(J) = 0$ this is trivial. Suppose it is true whenever $\alpha_1(J)$ is less
than some positive integer $q$. Consider the case $\alpha_1(J) = q$.

Let $h$ be any 1-chain on $J$ over $Q_n$, and let $A$ be any edge of $J$. Let $h_A$ be the
1-chain on $J'_A$ over $Q_n$ such that $h_A(B) = h(B)$ for each $B \in E(J) - \{A\}$. By the
inductive hypothesis $h_A$ is the coboundary of a 0-chain $f$ on $J'_A$. Let $x$ be the
positive and $y$ the negative end of $A$ in $J$. Let $J_0$ be the component of $J'_A$ of which
$x$ (but not $y$) is a vertex. If we replace $f(a)$ by $f(a) + s$ for each $a \in V(J_0)$, where $s$
is an element of $Q_n$, we shall not alter the coboundary of $f$ in $J'_A$. We may therefore
suppose that $f(x) - f(y) = h(A)$. Then $h$ is the coboundary of $f$ in $J$.

The Lemma follows by induction.

Now consider a graph for which each edge is either a loop or an isthmus.
Suppose such a graph $H$ has $l(H)$ loops and $i(H)$ isthmuses. We have, as a
consequence of the Lemma,

(10)    $\theta(H, n) = \begin{cases} 0 & \text{if } l(H) > 0, \\ (n - 1)^{i(H)} & \text{if } l(H) = 0. \end{cases}$

As a consequence of (5) we have also

(11)    $\phi(H, n) = \begin{cases} 0 & \text{if } i(H) > 0, \\ (n - 1)^{l(H)} & \text{if } i(H) = 0. \end{cases}$



3. The dichromate of a graph. We now define a function x(G, x, y) of 
two variables x and y, which may be regarded as generalizing both @(G, n) 
and $(G, 2). We call it the dichromate of G. 

If G has no edge we write 
(12) x(G, x,y) = 1. 


If G has an edge and is connected we proceed as follows. First we enumerate 
the edges of G as Ai,..., Am. 

Consider any spanning tree JT of G. Suppose A, is an edge of 7. Then 7',,’ 
has two components, C and D say. Each has one end of A, as a vertex. We say 
A , is internally active in T if each edge A, of G other than A , which has one end a 
vertex of C and one end a vertex of D satisfies k < j. 

Now suppose A , is not an edge of 7. Denote its ends by a and 6. (They may 
not be distinct.) There is a simple path P in T from a to }. There is only one 
such path. For suppose there are two distinct simple paths P; and P, in T 
from a to 6. Then we may suppose some edge A, of T appears in P, but not 
in P,. Let its ends be c and d, c preceding A, in P,;. Then in 7,,’ there are 
paths from d to b, from a to b and from a to c. Hence ¢ and d are vertices of 
the same component of 7,,’. This is impossible since A, is an isthmus of 7. 
We say A, is externally active in T if each A, which is a term of P satisfies 
k<j. 

If $A_j$ is an edge of G we write $\lambda(T, A_j) = 1$ or 0 according as $A_j$ is or is not
internally active in T. We write also $\mu(T, A_j) = 1$ or 0 according as $A_j$ is or
is not externally active in T. We call $\lambda(T, A_j)$ and $\mu(T, A_j)$ the internal and
external activities respectively of $A_j$ in T. We denote by $r(T)$ and $s(T)$ the
numbers of edges of G which are internally and externally active respectively
in T.

We define the dichromate of G by the formula

(13)    $\chi(G, x, y) = \sum_{T} x^{r(T)} y^{s(T)}$,


the summation being over all the spanning trees of G. 
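As an illustration of definition (13), the following Python sketch evaluates the sum by brute force for a small connected graph. It is only a sketch under stated assumptions: the vertices are numbered 0, ..., n-1, the enumeration $A_1, \ldots, A_m$ is taken to be the order of the edge list, and the function names are hypothetical. The helper `enumeration_independent` tests, for one graph, the independence of the enumeration that is proved below.

```python
from collections import Counter
from itertools import combinations, permutations


def _labels(n, edges):
    """Component label of each vertex 0..n-1 for the given edge list (union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return [find(v) for v in range(n)]


def dichromate(n, edges):
    """Coefficients {(r, s): count} of x^r y^s in (13) for a connected graph.

    Assumption: position j in `edges` plays the role of the suffix of A_{j+1}.
    """
    m = len(edges)
    poly = Counter()
    for tree in combinations(range(m), n - 1):
        if len(set(_labels(n, [edges[j] for j in tree]))) != 1:
            continue  # n - 1 edges that do not connect G: not a spanning tree
        in_tree = set(tree)
        # cut[k] labels the two components of T after suppressing tree edge k
        cut = {k: _labels(n, [edges[j] for j in tree if j != k]) for k in tree}
        r = s = 0
        for j in range(m):
            if j in in_tree:
                lab = cut[j]
                # internally active: no edge with a larger suffix joins C and D
                r += all(lab[edges[k][0]] == lab[edges[k][1]] for k in range(j + 1, m))
            else:
                a, b = edges[j]
                # tree edge k lies on the path from a to b iff its removal separates them
                s += all(cut[k][a] == cut[k][b] for k in tree if k > j)
        poly[(r, s)] += 1
    return dict(poly)


def enumeration_independent(n, edges):
    """Brute-force check that every enumeration of the edges gives the same sum."""
    reference = dichromate(n, edges)
    return all(dichromate(n, list(p)) == reference for p in permutations(edges))
```

For a triangle this gives one tree for each of the monomials $x^2$, $x$ and $y$. The search over all subsets of n - 1 edges is exponential, so the sketch is meant for very small graphs only.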

We note that at least one spanning tree of G exists. This is proved by König
(3, p. 60). (Our "spanning tree" is König's Gerüst.) Hence the polynomial on
the right of (13) is not identically zero.

To make the above definition significant we must show that $\chi(G, x, y)$, as
defined by (13), is independent of the particular enumeration of the edges of
G which is used.

To prove this we study the effect of interchanging the symbols $A_i$ and $A_{i+1}$
between the two corresponding edges. With respect to the new enumeration let
$\lambda'(T, A_j)$ and $\mu'(T, A_j)$ denote the internal and external activities respectively
in the spanning tree T of G, of the edge initially denoted by $A_j$. For each T let
the interchange of the two symbols replace $r(T)$ and $s(T)$ by $r'(T)$ and $s'(T)$
respectively.

The following argument is stated in terms of the initial enumeration.













First we observe that the interchange leaves $\lambda(T, A_j)$ and $\mu(T, A_j)$ unaltered
if $A_j$ is not $A_i$ or $A_{i+1}$. Hence

(14)    $r'(T) = r(T) - \lambda(T, A_i) - \lambda(T, A_{i+1}) + \lambda'(T, A_i) + \lambda'(T, A_{i+1})$,

(15)    $s'(T) = s(T) - \mu(T, A_i) - \mu(T, A_{i+1}) + \mu'(T, A_i) + \mu'(T, A_{i+1})$.


We partition the set of all spanning trees of G into three disjoint classes X, Y,
and Z as follows. $T \in X$ if $A_i$ and $A_{i+1}$ are both edges of T, $T \in Y$ if neither
$A_i$ nor $A_{i+1}$ is an edge of T, and $T \in Z$ if one but not both of $A_i$ and $A_{i+1}$
is an edge of T.

If $T \in X$ or $T \in Y$ it is clear that the internal and external activities in
T of $A_i$ and $A_{i+1}$ are not altered by the interchange. Hence $r'(T) = r(T)$ and
$s'(T) = s(T)$ in these cases, by (14) and (15).

If $T \in Z$ let $A_j$ be the member of the pair $\{A_i, A_{i+1}\}$ which is an edge of
T and let $A_k$ be the other member. Let the ends of $A_j$ be a and b. Let C and D
be the two components of $T'_{A_j}$, having a and b respectively as a vertex. Let c
and d be the ends, not necessarily distinct, of $A_k$. We partition the set Z into
two disjoint subclasses $Z_1$ and $Z_2$ by the following rule: $T \in Z_1$ if c and d are
vertices of the same component of $T'_{A_j}$, and $T \in Z_2$ otherwise.

If $T \in Z_1$ the simple path P from c to d in T does not have $A_j$ as a term.
Accordingly the internal and external activities of $A_j$ and $A_k$ in T are not
affected by the interchange. So $r'(T) = r(T)$ and $s'(T) = s(T)$ in this case also.

Suppose $T \in Z_2$. Then we may suppose that c is a vertex of C and d is a
vertex of D. Let $\sigma(T)$ be the spanning subgraph of G obtained by suppressing
the edge $A_j$ and adjoining the edge $A_k$. Clearly $\sigma(T)$ is connected. We show that
it is a spanning tree of G. For otherwise there is a circular path P in $\sigma(T)$. This
has $A_k$ as a term since it is not a path in T. This implies that there is a simple
path from c to d in $(\sigma(T))'_{A_k}$, that is in $T'_{A_j}$, which is false. Now clearly
$\sigma(T) \in Z_2$ and $\sigma(\sigma(T)) = T$. We note that $A_j$ and $A_k$ must be redefined in
terms of $\sigma(T)$ before the operation $\sigma$ is repeated.

We deduce that $Z_2$ can be partitioned into disjoint pairs of the form
$\{T, \sigma(T)\}$ such that $A_i$ is an edge of T. In what follows we take T to be the first
member of such a pair.

Suppose first that some edge $A_w$ of G distinct from $A_i$ and $A_{i+1}$ is internally
active in T but not in $\sigma(T)$.

Without loss of generality we may suppose $A_w$ is an edge of the tree C. Let
$C_1$ and $C_2$ be the two components of $C'_{A_w}$, a being a vertex of $C_2$. Let $A_w$ have
ends $\alpha \in V(C_1)$ and $\beta \in V(C_2)$. If $c \in V(C_2)$ then since $\lambda(\sigma(T), A_w) = 0$ there
is an edge $A_v$ of G such that $v > w$ and which has one end in $V(C_1)$ and one
end in $V(C_2)$ or $V(D)$. But then $A_w$ cannot be internally active in T, contrary
to its definition. We deduce that $c \in V(C_1)$. Now since $\lambda(\sigma(T), A_w) = 0$ it
follows that there is an edge $A_v$ of G having one end $\gamma$ in $V(C_2)$ and one end $\delta$
in $V(D)$ or $V(C_1)$, and which satisfies $v > w$. Actually $\delta \in V(D)$ since
$\lambda(T, A_w) = 1$. This state of affairs is represented in Figure 1.

Figure 1

Figure 2


Since $\lambda(T, A_w) = 1$ we have $v > w > i + 1$. Hence

(16)    $\lambda(T, A_i) = \lambda'(T, A_i) = 0$,    $\lambda(\sigma(T), A_{i+1}) = \lambda'(\sigma(T), A_{i+1}) = 0$.

There is a circular path J from a to a in G which has $A_i$, $A_{i+1}$ and $A_w$ as terms.
Apart from these terms it is made up of three simple paths, one from b to d in
D, one from c to $\alpha$ in $C_1$ and one from $\beta$ to a in $C_2$. It follows that the simple
paths from c to d in T and from a to b in $\sigma(T)$ each have $A_w$ as a term. Hence

(17)    $\mu(T, A_{i+1}) = \mu'(T, A_{i+1}) = 0$,    $\mu(\sigma(T), A_i) = \mu'(\sigma(T), A_i) = 0$.


Suppose next that some edge $A_w$ of G distinct from $A_i$ and $A_{i+1}$ is externally
active in T but not in $\sigma(T)$. Let its ends be $\alpha$ and $\beta$. They are not vertices of the
same tree C or D; otherwise the simple paths from $\alpha$ to $\beta$ in T and $\sigma(T)$ would
be identical and this would imply $\mu(T, A_w) = \mu(\sigma(T), A_w)$. Hence we may
suppose $\alpha \in V(C)$ and $\beta \in V(D)$. (See Fig. 2.)













Let $P_1$ and $P_2$ be the simple paths in T and $\sigma(T)$ respectively from $\alpha$ to $\beta$.
Then $A_i$ is a term of $P_1$ and $A_{i+1}$ is a term of $P_2$. Since $\mu(T, A_w) = 1$ we have
$w > i$ and therefore $w > i + 1$. Hence formula (16) holds in this case also.

In $P_1$ let $\alpha'$ be the last vertex of G preceding $A_i$ which is a term of $P_2$, and
let $\beta'$ be the first vertex of G succeeding $A_i$ which is a term of $P_2$. Clearly
$\alpha' \in V(C)$ and $\beta' \in V(D)$. Let $R_1$ and $R_2$ be the subsequences of $P_1$ and $P_2$
respectively extending from $\alpha'$ to $\beta'$. There is a circular path J in G formed by taking
first the terms of $R_1$ and then the terms of $R_2$ in reverse order. It has $A_i$ and
$A_{i+1}$ as terms, for they are terms of $R_1$ and $R_2$ respectively. The subsequences
of $P_1$ and $P_2$ extending from $\alpha$ to $\alpha'$ are identical, since C is a tree. Similarly
the subsequences of $P_1$ and $P_2$ extending from $\beta'$ to $\beta$ are identical.

Since $\mu(T, A_w) = 1$ and $\mu(\sigma(T), A_w) = 0$ there must be an edge $A_v$ of G
which is a term of J and satisfies $v > w$. Then the simple paths from c to d in T
and from a to b in $\sigma(T)$ each have $A_v$ as a term. Hence formula (17) still
holds.

Next we consider the case in which some edge of G is internally or externally
active in $\sigma(T)$ but not in T. We first go over to the new enumeration by
interchanging the symbols $A_i$ and $A_{i+1}$. This interchanges T and $\sigma(T)$. The
foregoing argument shows that (16) and (17) are true in the new enumeration.
They are relations between the two enumerations. To state them in terms of the
old enumeration we have merely to interchange the symbols $\lambda$ and $\lambda'$, $\mu$ and $\mu'$,
$A_i$ and $A_{i+1}$ and finally T and $\sigma(T)$. But the sets of equations are invariant
under this operation.

In all these three cases (16) and (17) are true. Hence by (14) and (15) we
have $r'(T) = r(T)$, $r'(\sigma(T)) = r(\sigma(T))$, $s'(T) = s(T)$ and $s'(\sigma(T)) = s(\sigma(T))$.

We now consider the remaining case, in which $\lambda(T, A_w) = \lambda(\sigma(T), A_w)$ and
$\mu(T, A_w) = \mu(\sigma(T), A_w)$ for each edge $A_w$ of G other than $A_i$ and $A_{i+1}$. If there
is an edge $A_v$ of G satisfying $v > i + 1$ and having ends in both $V(C)$ and
$V(D)$, then $\lambda(T, A_i) = \lambda'(T, A_i) = \lambda(\sigma(T), A_{i+1}) = \lambda'(\sigma(T), A_{i+1}) = 0$. If
there is no such edge $A_v$ we have instead $\lambda(T, A_i) = \lambda'(\sigma(T), A_{i+1}) = 0$ and
$\lambda(\sigma(T), A_{i+1}) = \lambda'(T, A_i) = 1$.

There is a circular path J from a to a in G having $A_i$ and $A_{i+1}$ as terms and
otherwise consisting of a simple path from a to c in C and another from d to b
in D. If J has a term $A_v$ such that $v > i + 1$, then $\mu(T, A_{i+1}) = \mu'(T, A_{i+1}) =
\mu(\sigma(T), A_i) = \mu'(\sigma(T), A_i) = 0$. If J has no such term we have instead
$\mu(T, A_{i+1}) = \mu'(\sigma(T), A_i) = 1$ and $\mu(\sigma(T), A_i) = \mu'(T, A_{i+1}) = 0$.

It follows that $r'(T) = r(\sigma(T))$, $r'(\sigma(T)) = r(T)$, $s'(T) = s(\sigma(T))$ and
$s'(\sigma(T)) = s(T)$.

The foregoing analysis shows that the sum on the right of (13) is not affected
by the interchange of the symbols $A_i$ and $A_{i+1}$. All that happens is that the
contributions to the sum of certain pairs of trees are interchanged. But any
permutation of the symbols $A_1, \ldots, A_m$ can be effected by a finite number of
interchanges of consecutive symbols. Hence the function $\chi(G, x, y)$ defined by
(13) is independent of the particular enumeration of edges employed.







We extend the definition of the dichromate to graphs which are not connected
as follows. If the components of G are $G_1, \ldots, G_k$, then

(18)    $\chi(G, x, y) = \prod_{i=1}^{k} \chi(G_i, x, y)$.

This is consistent with (12) in the case of an edgeless graph.
We note some general properties of the dichromate.

(i) $\chi(G, x, y)$ is a polynomial of degree $\alpha_0(G) - p_0(G)$ in x and of degree
$\alpha_1(G) - \alpha_0(G) + p_0(G)$ in y.


Proof. By a simple induction we find that a graph S in which each edge
is an isthmus satisfies $\alpha_1(S) = \alpha_0(S) - p_0(S)$. A connected graph G has at
least one spanning tree, and each such tree T satisfies $\alpha_1(T) = \alpha_0(G) - p_0(G)$.

If G is connected and has an edge the theorem follows from (13). For the
contribution to the sum on the right of (13) of any spanning tree T is of degree
at most $\alpha_1(T)$ in x and at most $\alpha_1(G) - \alpha_1(T)$ in y. By choosing a suitable
enumeration of the edges of G we can arrange that either of these values is
attained. The proposition follows in this case.

We extend it to all G by applying (12) and (18). 


(ii) If A is an edge of G not a loop or an isthmus, then
(19)    $\chi(G, x, y) = \chi(G'_A, x, y) + \chi(G''_A, x, y)$.

Proof. This proof depends on the observation that for a connected G the
spanning trees of $G'_A$ are those spanning trees of G which do not have A as an
edge, while the spanning trees of $G''_A$ are the graphs $T''_A$ such that T is a
spanning tree of G having A as an edge. We enumerate the edges of G so that
$A = A_1$. We obtain corresponding enumerations for $G'_A$ and $G''_A$ by rejecting
$A_1$ and then reducing each suffix by 1. With these enumerations each tree not
having A as an edge makes the same contribution to $\chi(G'_A, x, y)$ as to $\chi(G, x, y)$,
and any other tree T makes the same contribution to $\chi(G, x, y)$ as does $T''_A$
to $\chi(G''_A, x, y)$.

The proposition follows for a connected G. 

If G is not connected let $G_1$ be its component having A as an edge. Then
$G'_A$ and $G''_A$ have the same components as G except that $G_1$ is replaced by
$(G_1)'_A$ and $(G_1)''_A$ respectively. Since (19) is true for $G_1$ it follows from (18)
that it is true also for G.

(iii) Let H be a graph having $l(H)$ loops, $i(H)$ isthmuses, and no other edge. Then
(20)    $\chi(H, x, y) = x^{i(H)} y^{l(H)}$.

Proof. If H is connected form $H_1$ from it by suppressing the loops. Clearly
$H_1$ is the only spanning tree of H. So (20) follows from (12) and (13). Using
(18) we readily extend the formula to the general case.

Formulae (19) and (20) provide a method for computing the dichromate 













of a given graph G. If G has an edge A which is not a loop or an isthmus then 
(19) expresses the dichromate in terms of the dichromates of simpler graphs. 
Otherwise (20) gives the dichromate directly. 
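The computation just described can be sketched in Python as follows. This is an illustrative sketch, not part of the paper: graphs are given as lists of vertex pairs (so multiple edges and loops are allowed), isolated vertices are ignored because they contribute the factor 1 by (12) and (18), and the names `dichromate_dc` and `_connected` are hypothetical.

```python
from collections import Counter


def _connected(edges, a, b):
    """True if a and b are joined by a path that uses only the given edges."""
    seen, stack = {a}, [a]
    while stack:
        v = stack.pop()
        for p, q in edges:
            for w, z in ((p, q), (q, p)):
                if w == v and z not in seen:
                    seen.add(z)
                    stack.append(z)
    return b in seen


def dichromate_dc(edges):
    """Coefficients {(r, s): count} of x^r y^s, computed from (19) and (20)."""
    edges = list(edges)
    for idx, (a, b) in enumerate(edges):
        rest = edges[:idx] + edges[idx + 1:]
        if a != b and _connected(rest, a, b):
            # A is neither a loop nor an isthmus: apply (19).
            # Contracting A identifies its end b with its end a.
            contracted = [(a if p == b else p, a if q == b else q) for p, q in rest]
            total = Counter(dichromate_dc(rest))
            total.update(dichromate_dc(contracted))
            return dict(total)
    # Every remaining edge is a loop or an isthmus: apply (20).
    loops = sum(1 for a, b in edges if a == b)
    return {(len(edges) - loops, loops): 1}
```

The recursion doubles at every application of (19), so it is practical only for small graphs; it is included to make the reduction concrete rather than to be efficient.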

Such computations may sometimes be shortened by using the following 
theorem. 

(iv) If G consists of two connected graphs $H_1$ and $H_2$, having just one vertex
b in common, then

$\chi(G, x, y) = \chi(H_1, x, y)\, \chi(H_2, x, y)$.

To prove this we observe that a subgraph of G is a spanning tree of G if and
only if it is the union of a spanning tree of $H_1$ and a spanning tree of $H_2$. We
then apply (13).


Comparing (19) and (20) with (6) and (10), or with (7) and (11), we arrive
at inductive proofs of the following formulae.

(21)    $\theta(G, n) = (-1)^{\alpha_0(G) - p_0(G)} \chi(G, 1 - n, 0)$,

(22)    $\phi(G, n) = (-1)^{\alpha_1(G) - \alpha_0(G) + p_0(G)} \chi(G, 0, 1 - n)$.

These formulae justify our description of $\chi(G, x, y)$ as generalizing both
$\theta(G, n)$ and $\phi(G, n)$.

The result that for a connected graph G the sum on the right of (13) is in- 
variant under a change of enumeration of the edges is an interesting theorem 
about the spanning trees of G. As one of its corollaries we have: 


For each enumeration of the edges of G there exist spanning trees $T_1$ and $T_2$ of G
such that each edge of $T_1$ is internally active in $T_1$ and each edge of G not an edge
of $T_2$ is externally active in $T_2$.


The number C(G) of spanning trees of a graph G is important in the theory 
of electrical networks in which the conductance of each wire is unity. A summary 
of this theory is given in (2). So the theory of spanning trees provides a link 
between the theory of graph-colourings and the theory of electrical networks. 
The dichromate can be regarded as a generalization of C(G), for we have

$C(G) = \chi(G, 1, 1)$.


C(G) has a simple expression as a determinant, and its properties are well 
known. Perhaps some of them will suggest new properties of the dichromate 
and hence of the chromatic polynomials. 
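The determinant in question is not written out here. One standard form, assumed in the sketch below, is Kirchhoff's matrix-tree theorem: C(G) equals any first minor of the matrix whose diagonal entries are the vertex valencies (loops not counted) and whose off-diagonal entries are minus the numbers of joining edges. The function name is hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def spanning_tree_count(n, edges):
    """Number of spanning trees of a graph on vertices 0..n-1 (loops are ignored)."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        if u != v:
            lap[u][u] += 1
            lap[v][v] += 1
            lap[u][v] -= 1
            lap[v][u] -= 1
    minor = [row[1:] for row in lap[1:]]  # delete row 0 and column 0
    size = n - 1
    det = Fraction(1)
    for c in range(size):
        # Gaussian elimination with row exchanges, tracking the determinant
        pivot = next((r for r in range(c, size) if minor[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            minor[c], minor[pivot] = minor[pivot], minor[c]
            det = -det
        det *= minor[c][c]
        for r in range(c + 1, size):
            factor = minor[r][c] / minor[c][c]
            for k in range(c, size):
                minor[r][k] -= factor * minor[c][k]
    return int(det)
```

For the complete graph on four vertices this returns 16, which agrees with the sum of the coefficients of its dichromate.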


REFERENCES 


1. G. D. Birkhoff and D. C. Lewis, Chromatic polynomials, Trans. Amer. Math. Soc., 60
(1946), 355-451.

2. R. L. Brooks, C. A. B. Smith, A. H. Stone and W. T. Tutte, The dissection of rectangles
into squares, Duke Math. J., 7 (1940), 312-340.

3. Dénes König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936).

4. J. Petersen, Sur le théorème de Tait, L'Intermédiaire des mathématiciens, 5 (1898), 225-227.

5. W. T. Tutte, On the imbedding of linear graphs in surfaces, Proc. London Math. Soc. (2),
51 (1949), 474-483.

6. Hassler Whitney, A logical expansion in mathematics, Bull. Amer. Math. Soc., 38 (1932),
572-579.

7. Hassler Whitney, The coloring of graphs, Ann. Math., 33 (1932), 688-718.





University of Toronto. 











MAPS OF CERTAIN ALGEBRAIC CURVES INVARIANT 
UNDER CYCLIC INVOLUTIONS OF PERIODS THREE, 
FIVE, AND SEVEN 


W. R. HUTCHERSON AND S. T. GORMSEN


1. Introduction. In earlier papers (4; 5; 6), certain space curves, invariant 
under cyclic involutions of periods three, five, and seven, have been mapped. 
Lucien Godeaux (2; 3) in 1916 mapped plane cubic curves, invariant under an 
involution of period three, onto a cubic surface in ordinary three-space. Mlle. 
J. Dessart (1) in 1931 mapped plane quintic curves, invariant under an in- 
volution of order five, onto a fifth order surface in a space of four dimensions. 

This paper concerns itself with the mapping of plane septimic curves,
invariant under the cyclic involution

T:    $x_1' : x_2' : x_3' = x_1 : E x_2 : E^2 x_3$,    where $E^7 = 1$,

onto a linear space ($S_5$) of five dimensions.

The general system of plane curves of order seven is, in general, non-invariant
under the transformation T. It can be split up, however, into seven invariant
curves:

(1)    $\sum_{i=0}^{6} \lambda_i C_i = 0$,
where 


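$C_0 = v_0 x_1^7 + v_1 x_1^3 x_2 x_3^3 + v_2 x_1^2 x_2^3 x_3^2 + v_3 x_1 x_2^5 x_3 + v_4 x_2^7 + v_5 x_3^7$,
$C_1 = u_0 x_1^4 x_3^3 + u_1 x_1^3 x_2^2 x_3^2 + u_2 x_1^2 x_2^4 x_3 + u_3 x_1 x_2^6 + u_4 x_2 x_3^6$,
$C_2 = u_0 x_1^4 x_2 x_3^2 + u_1 x_1^3 x_2^3 x_3 + u_2 x_1^2 x_2^5 + u_3 x_1 x_3^6 + u_4 x_2^2 x_3^5$,
$C_3 = u_0 x_1^5 x_3^2 + u_1 x_1^4 x_2^2 x_3 + u_2 x_1^3 x_2^4 + u_3 x_1 x_2 x_3^5 + u_4 x_2^3 x_3^4$,
$C_4 = u_0 x_1^5 x_2 x_3 + u_1 x_1^4 x_2^3 + u_2 x_1^2 x_3^5 + u_3 x_1 x_2^2 x_3^4 + u_4 x_2^4 x_3^3$,
$C_5 = u_0 x_1^6 x_3 + u_1 x_1^5 x_2^2 + u_2 x_1^2 x_2 x_3^4 + u_3 x_1 x_2^3 x_3^3 + u_4 x_2^5 x_3^2$,
$C_6 = u_0 x_1^6 x_2 + u_1 x_1^3 x_3^4 + u_2 x_1^2 x_2^2 x_3^3 + u_3 x_1 x_2^4 x_3^2 + u_4 x_2^6 x_3$.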
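The splitting into seven systems can be checked mechanically. The sketch below rests on one assumption: T multiplies $x_2$ by E and $x_3$ by $E^2$ with $E^7 = 1$, so that a monomial $x_1^i x_2^j x_3^k$ is multiplied by $E^{j + 2k}$. It groups the 36 monomials of degree seven by the residue of $j + 2k$ modulo 7; the function name and the tuple encoding (i, j, k) are illustrative only.

```python
from collections import defaultdict


def invariant_classes(period=7, degree=7, weights=(0, 1, 2)):
    """Group monomials x1^i x2^j x3^k of the given degree by the power of E they acquire."""
    classes = defaultdict(list)
    for j in range(degree + 1):
        for k in range(degree + 1 - j):
            i = degree - j - k
            w = (weights[0] * i + weights[1] * j + weights[2] * k) % period
            classes[w].append((i, j, k))
    return dict(classes)


classes = invariant_classes()
# Residue 0 gives the six invariant monomials of C_0; each other residue gives five.
sizes = {w: len(monomials) for w, monomials in classes.items()}
```

Under this assumption the residue 0 class consists of $x_1^7$, $x_1^3 x_2 x_3^3$, $x_1^2 x_2^3 x_3^2$, $x_1 x_2^5 x_3$, $x_2^7$ and $x_3^7$, and the monomial $x_1^4 x_3^3$ of $C_1$ falls in the residue 6 class.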


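The (1,1) correspondence between the $\infty^5$ curves of $C_0$ and the hyperplanes
of $S_5$ defines the transformation

(2)    $\frac{X_0}{x_1^7} = \frac{X_1}{x_1^3 x_2 x_3^3} = \frac{X_2}{x_1^2 x_2^3 x_3^2} = \frac{X_3}{x_1 x_2^5 x_3} = \frac{X_4}{x_2^7} = \frac{X_5}{x_3^7}$.

By eliminating the $x_i$'s from these equations, one gets for the new surface
F the equations

(3)    $\left\| \begin{matrix} X_1 & X_2 & X_3 & X_0 X_5 \\ X_2 & X_3 & X_4 & X_1^2 \end{matrix} \right\| = 0$.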


Received September 25, 1952. 

















This surface F is the branch point surface in $S_5$ of the transformation.


2. Harmonic homology. A careful study of the invariant curves $C_i$ shows
that the homography

$\Omega$:    $y_1 : y_2 : y_3 = x_3 : x_2 : x_1$

in the plane containing the involution $I_7$, is a harmonic homology of centre
A ($x_2 = 0$, $x_1 + x_3 = 0$) and with axis a ($x_1 - x_3 = 0$). This homology
transforms $O_1$ (1,0,0) into $O_3$ (0,0,1), $O_3$ into $O_1$, and $O_2$ (0,1,0) into itself.
Furthermore, the homology also transforms the totality of curves $C_0$ into $C_0$, $C_1$ into
$C_6$, $C_2$ into $C_5$, and $C_3$ into $C_4$.

This harmonic homology corresponds in $S_5$ to the harmonic homology

$\Omega'$:    $X_0' : X_1' : X_2' : X_3' : X_4' : X_5' = X_5 : X_1 : X_2 : X_3 : X_4 : X_0$

with centre at A' ($X_1 = X_2 = X_3 = X_4 = 0$, $X_0 + X_5 = 0$) and axis the
hyperplane a' ($X_0 - X_5 = 0$). This homology transforms the surface F into
itself.


3. Image curves. We will designate by $\Gamma_0$ the hyperplane sections of F,
which correspond to the curves $C_0$. Likewise, $\Gamma_1$, $\Gamma_2$, $\Gamma_3$, $\Gamma_4$, $\Gamma_5$, and $\Gamma_6$ are
the curves on F, which correspond, respectively, to the curves $C_1$, $C_2$, $C_3$, $C_4$,
$C_5$, and $C_6$.

By the indicated homology, then, these curves are transformed as follows:
$\Gamma_0$ goes into itself, $\Gamma_1$ into $\Gamma_6$, $\Gamma_2$ into $\Gamma_5$, and $\Gamma_3$ into $\Gamma_4$.

It is observed that any curve from the system $C_i$ (i = 1, 2, ..., 6) intersects
a curve from $C_0$ in forty-nine points, forming seven groups of the involution $I_7$.
It follows, therefore, that $\Gamma_i$ (i = 1, 2, ..., 6) will intersect $\Gamma_0$ in seven points,
making the curves $\Gamma_i$ of order seven.

The equations of $\Gamma_1$ are

$\left\| \begin{matrix} X_1 & X_2 & X_3 & X_0 X_5 & (-u_4 X_5) \\ X_2 & X_3 & X_4 & X_1^2 & (u_0 X_1 + u_1 X_2 + u_2 X_3 + u_3 X_4) \end{matrix} \right\| = 0$.


In the equations of the remaining $\Gamma_i$, the matrices differ only in the last column.
For i = 2, ..., 6 the last columns are, respectively:

(—u4X5), (uoX, ~~ UX 2 + UX 3 + U3X 5); (—u,X 0X, a UX 9 ¢ 5 
(usXi + waXiX 2 + woX 0X1); (—usXiX5 — wXoX5),(woXT + wiX Xs + weX Xs); 


(u X ) ), (uoX o _ UX ; + U3X 2 a u4X 3 iP (—uoXo). (um X, + U2X » ~~ UsX3 4 uyX 4 R 


4. Branch point $O_0'$. To point $O_1$ on the plane corresponds on F the point
$O_0'$ (1,0,0,0,0,0). The singularities of this point will now be studied. Consider
the family ($\infty^4$) of curves from the system $C_0$, passing through the invariant
point $O_1$. The equation for this family of curves is













(4)    $v_1 x_1^3 x_2 x_3^3 + v_2 x_1^2 x_2^3 x_3^2 + v_3 x_1 x_2^5 x_3 + v_4 x_2^7 + v_5 x_3^7 = 0$.

Applying to equations (4) the quadratic transformation

U:    $x_1 : x_2 : x_3 = z_1^2 : z_1 z_2 : z_2 z_3$

and simplifying, we get

(5)    $z_1^7 (v_1 z_3^3 + v_2 z_2 z_3^2 + v_3 z_2^2 z_3 + v_4 z_2^3) + v_5 z_2^3 z_3^7 = 0$.

This shows that to the point $O_{12}$ (the first order neighbourhood of $O_1$ on
$x_3 = 0$), corresponds the triple point ($z_2 = z_3 = 0$) for the curves (5).

In order to obtain the points, infinitely near $O_0'$ (1,0,0,0,0,0) on F, which
correspond to the points infinitely near the point ($z_2 = z_3 = 0$), it is necessary,
first, to project the surface F from the point $O_0'$ onto the hyperplane $X_0 = 0$.
This gives the independent equations

$F_1$:    $X_1 X_3 = X_2^2$,    $X_2 X_4 = X_3^2$,    $X_0 = 0$,

and the dependent equation

$X_1 X_4 = X_2 X_3$.

It is noted that the plane $X_0 = X_2 = X_3 = 0$ satisfies the independent equations,
but not the dependent equation. Thus $F_1$ must be a cubic surface (1).
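The three equations of $F_1$ can be verified directly from the parametrization (2). The following sketch assumes the assignment $X_0 = x_1^7$, $X_1 = x_1^3 x_2 x_3^3$, $X_2 = x_1^2 x_2^3 x_3^2$, $X_3 = x_1 x_2^5 x_3$, $X_4 = x_2^7$, $X_5 = x_3^7$, and represents each coordinate by its exponent vector, so that a product of coordinates corresponds to a sum of vectors. The helper name `times` is hypothetical.

```python
# Exponent vectors (i, j, k) of x1^i x2^j x3^k assumed for X_0, ..., X_5 in (2).
X = [(7, 0, 0), (3, 1, 3), (2, 3, 2), (1, 5, 1), (0, 7, 0), (0, 0, 7)]


def times(*indices):
    """Exponent vector of the product of the coordinates with the given indices."""
    return tuple(sum(X[i][c] for i in indices) for c in range(3))


relations = {
    "X1*X3 = X2^2": times(1, 3) == times(2, 2),
    "X2*X4 = X3^2": times(2, 4) == times(3, 3),
    "X1*X4 = X2*X3": times(1, 4) == times(2, 3),
    "X1^3 = X0*X2*X5": times(1, 1, 1) == times(0, 2, 5),
}
```

All four entries of `relations` come out true under the stated assignment; the last one is the relation obtained from (3) by comparing the first and fourth columns.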

Second, apply the transformation U to the transformation (2) in which
$X_0 = 0$ and obtain the simplified expression

$\frac{X_1}{z_1^7 z_3^3} = \frac{X_2}{z_1^7 z_2 z_3^2} = \frac{X_3}{z_1^7 z_2^2 z_3} = \frac{X_4}{z_1^7 z_2^3} = \frac{X_5}{z_2^3 z_3^7}$.

Since one is interested in approaching the point ($z_2 = z_3 = 0$) from all
directions, let $z_3 = k z_2$ and substitute in the last equations. Let $z_2$ approach
zero, which implies that $X_5 = 0$. Eliminating k from the resulting equations,
one obtains the cubic cone

(6)    $X_1 X_3 = X_2^2$,    $X_2 X_4 = X_3^2$,    $X_5 = 0$.

This cone intersects $F_1$ in a twisted cubic curve

($\gamma_1$)    $X_1 X_3 = X_2^2$,    $X_2 X_4 = X_3^2$,    $X_0 = X_5 = 0$.

This shows that the points of the first order neighbourhood of the point
($z_2 = z_3 = 0$), as well as the points in the domain of the first order neighbourhood
of $O_{12}$, correspond projectively to points of this cubic curve. The points of
this curve are projections of the points infinitely near $O_0'$ and on the cubic
tangent cone to F at the point $O_0'$.

Applying now the quadratic transformation

V:    $x_1 : x_2 : x_3 = z_1^2 : z_2 z_3 : z_1 z_3$

three times, successively, to the curves (4), one gets

(7)    $z_1^{21} (v_1 z_2 + v_5 z_3) + v_2 z_1^{14} z_2^3 z_3^5 + v_3 z_1^7 z_2^5 z_3^{10} + v_4 z_2^7 z_3^{15} = 0$.

This shows that to $O_{1333}$ in the third order neighbourhood of $O_1$ on $x_2 = 0$
corresponds simply the point ($z_2 = z_3 = 0$). Thus, the curves (4) pass simply
through $O_{13}$, $O_{133}$, and $O_{1333}$ on the line $x_2 = 0$.

Transforming the curves (7) of the z-plane to $S_5$ by (2) with $X_0 = 0$, we
obtain

$\frac{X_1}{z_1^{21} z_2} = \frac{X_2}{z_1^{14} z_2^3 z_3^5} = \frac{X_3}{z_1^7 z_2^5 z_3^{10}} = \frac{X_4}{z_2^7 z_3^{15}} = \frac{X_5}{z_1^{21} z_3}$.

Again substitute $z_3 = k z_2$ and allow $z_2$ to approach zero, obtaining

(8)    $X_5 = k X_1$,    $X_2 = X_3 = X_4 = 0$.

It follows, therefore, that to the points infinitely near the point $O_{1333}$
correspond projectively on the surface $F_1$ the points on the straight line

($a_1$)    $X_0 = X_2 = X_3 = X_4 = 0$.

This also means that the points infinitely near $O_{1333}$ correspond to the points
infinitely near the point $O_0'$ on F and lying in the plane

(9)    $X_2 = X_3 = X_4 = 0$.

This plane is tangent to F at the point $O_0'$.

We have now established the fact that to the invariant point $O_1$ of the
involution $I_7$, corresponds on the surface F a branch point $O_0'$, which is a
quadruple point. The quartic tangent cone at this point has degenerated into a
cubic tangent cone (6) and the tangent plane (9).

Moreover, the cubic curve $\gamma_1$ and the straight line $a_1$ have in common only
the point

$X_0 = X_2 = X_3 = X_4 = X_5 = 0$,

which is designated by $O_1'$ (0,1,0,0,0,0).

The cone (6) and the plane (9) have in common only the straight line

(10)    $X_2 = X_3 = X_4 = X_5 = 0$.

Hence, the tangent plane is also tangent to the cubic cone along this triple
line.


5. Images of curves $C_1$ at $O_0'$. The curves of the system $C_1$ have triple
points at the invariant point $O_1$. Each branch is tangent to the invariant
direction $x_3 = 0$.

Applying the transformation U to the curves $C_1$ gives

(11)    $z_1^7 (u_0 z_3^3 + u_1 z_2 z_3^2 + u_2 z_2^2 z_3 + u_3 z_2^3) + u_4 z_2^4 z_3^6 = 0$.

These curves have a triple point at $z_2 = z_3 = 0$, and the tangents to the
curves at this point have the equations

$u_0 z_3^3 + u_1 z_2 z_3^2 + u_2 z_2^2 z_3 + u_3 z_2^3 = 0$.

Thus, the point $O_{12}$, in the first order neighbourhood of $O_1$, is triple. Hence,
there are three simple variable points in the first order neighbourhood of $O_{12}$.













To approach the point ($z_2 = z_3 = 0$) along the curves (11), one substitutes
$z_3 = k z_2$ into their equation and allows $z_2$ to approach zero. It follows that

$z_1^7 (u_0 k^3 z_2^3 + u_1 k^2 z_2^3 + u_2 k z_2^3 + u_3 z_2^3) + u_4 k^6 z_2^{10} = 0$,
or
$z_1^7 (u_0 k^3 + u_1 k^2 + u_2 k + u_3) + u_4 k^6 z_2^7 = 0$,
and hence
(12)    $u_0 k^3 + u_1 k^2 + u_2 k + u_3 = 0$.

One has learned earlier that the points in the first order neighbourhood of
$O_{12}$ project into the points of the twisted cubic curve $\gamma_1$. Any one member of the
system $C_1$ has three points in the first order neighbourhood of $O_{12}$, and their
projections on $F_1$ are therefore on the twisted cubic curve $\gamma_1$. The three values
for k, found by solving equation (12), will locate the three points on $\gamma_1$.

Assume that the roots of (12) are $k'$, $k''$, and $k'''$. The three points then have
the coordinates:

($Q_1$)    $(0, k'^3, k'^2, k', 1, 0)$
($Q_2$)    $(0, k''^3, k''^2, k'', 1, 0)$
($Q_3$)    $(0, k'''^3, k'''^2, k''', 1, 0)$.

So the equation of the plane, $Q_1 Q_2 Q_3$, can now be written as

$u_0 X_1 + u_1 X_2 + u_2 X_3 + u_3 X_4 = 0$,    $X_0 = X_5 = 0$.
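A small numerical instance may make the last step concrete. The sketch below takes the hypothetical coefficients $u_0, u_1, u_2, u_3 = 1, -6, 11, -6$, for which (12) factors as $(k - 1)(k - 2)(k - 3) = 0$, and checks that the three points $(0, k^3, k^2, k, 1, 0)$ lie on the twisted cubic $\gamma_1$ and in the stated plane. The function names are illustrative.

```python
def point_on_gamma1(k):
    """Point (X_0, ..., X_5) = (0, k^3, k^2, k, 1, 0) belonging to the root k of (12)."""
    return (0, k**3, k**2, k, 1, 0)


def on_gamma1(X):
    """Test the equations X1*X3 = X2^2, X2*X4 = X3^2, X0 = X5 = 0 of the twisted cubic."""
    return X[1] * X[3] == X[2] ** 2 and X[2] * X[4] == X[3] ** 2 and X[0] == 0 and X[5] == 0


def plane_value(u, X):
    """Left-hand side u0*X1 + u1*X2 + u2*X3 + u3*X4 of the plane equation."""
    return u[0] * X[1] + u[1] * X[2] + u[2] * X[3] + u[3] * X[4]


u = (1, -6, 11, -6)  # assumed example: k^3 - 6k^2 + 11k - 6 = (k - 1)(k - 2)(k - 3)
points = [point_on_gamma1(k) for k in (1, 2, 3)]
checks = [(on_gamma1(X), plane_value(u, X)) for X in points]
```

Each entry of `checks` is `(True, 0)`: substituting a point of $\gamma_1$ into the plane equation reproduces the left-hand side of (12), which vanishes at a root.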


These results show that the images, designated by $\Gamma_1$, of the curves $C_1$
mapped upon the surface F, have a triple point at $O_0'$, and the tangents to the
curves at this point are the intersections of the cubic tangent cone (6) with the
hyperplane

$u_0 X_1 + u_1 X_2 + u_2 X_3 + u_3 X_4 = 0$.

When the curves $\Gamma_1$ are projected from $O_0'$ upon the cubic surface $F_1$, the
equations for the curves, now designated by $\Gamma_1'$, become

$\left\| \begin{matrix} X_1 & X_2 & X_3 & (-u_4 X_5) \\ X_2 & X_3 & X_4 & (u_0 X_1 + u_1 X_2 + u_2 X_3 + u_3 X_4) \end{matrix} \right\| = 0$,    $X_0 = 0$,

and are of order four.

The same general procedure is then applied to the remaining systems of
curves $C_2$, $C_3$, $C_4$, $C_5$, and $C_6$ to complete the study of the behavior of these
curves at the branch point $O_0'$.

The existing harmonic homology permits one then to deduce the behavior
of the same curves at the branch point $O_5'$ (0,0,0,0,0,1).

To analyse the systems of curves $C_i$ (i = 0, 1, 2, ..., 6) at the branch point
$O_4'$ (0,0,0,0,1,0), it becomes necessary to project the surface F onto the
hyperplane $X_4 = 0$ from the point $O_4'$. New quadratic transformations must be used,
and they are

W, Xi: Xo: Xy = 2323: Ze: 2223 
and 
| a X11: Xe: Xe = BySe: Se: 2Zs. 


6. Summary. The results of this paper show that the image of a plane
cyclic involution of period seven may be taken as a surface of order seven in a
linear space of five dimensions. The surface has two quadruple branch points,
whose tangent cones are formed by a cubic cone and a plane. The surface has
also a third branch point, which is a binode infinitely near to two binodes not
on the surface.

There exist on the surface six linear systems ($\infty^4$) of twisted septimic curves.
One system, $\Gamma_1$, passes triply through one of the quadruple points, with the
three tangent lines lying on the cubic cone; $\Gamma_1$ also passes simply through the
other quadruple point, with the tangent lying in the tangent plane at that
point. Finally, $\Gamma_1$ passes simply through the binode, with its tangent line in
one of the two tangent planes at this point.

A second system, $\Gamma_6$, has the same characteristics as $\Gamma_1$, only the roles of
the two quadruple points and the two tangent planes at the binode are
reversed.

A third system, $\Gamma_2$, passes triply through one of the two quadruple points,
with two tangent lines on the cubic cone and one on the tangent plane. It
passes simply through the other quadruple point, with its tangent line on the
cubic cone at that point. Finally, $\Gamma_2$ passes simply through the binode, with
its tangent line the line of intersection of the two tangent planes at that point.

A fourth system, $\Gamma_5$, has results analogous to $\Gamma_2$, except that the roles of the
two quadruple points are reversed.

A fifth system, $\Gamma_3$, passes doubly through one of the two quadruple points,
with the two tangent lines lying on the cubic cone. It passes doubly through the
other quadruple point, with one tangent line on the cubic cone and the other
on the tangent plane. Finally, $\Gamma_3$ passes simply through the binode. It has
for its tangent line the line of intersection of the two tangent planes at this
point.

A sixth and final system $\Gamma_4$ has the same properties as $\Gamma_3$, with the roles of
the two quadruple points reversed.

When the systems of curves, $\Gamma_1$ and $\Gamma_6$, are projected from one of the two
quadruple points onto a hyperplane, two new systems result, which are twisted
quartic curves.

When the systems I, and I; are projected from the binode, and when the 
systems I; and I, are projected from the line connecting this point with its 
adjacent binode, O;’, both onto respective hyperplanes, twisted quintic curves 
result. 













REFERENCES 


1. J. Dessart, Sur les surfaces représentant l'involution engendrée par une homographie de
période cinq du plan, Mem. Soc. Royale des Sciences de Liége (3), 17 (1931), 1-23.

2. M. L. Godeaux, Étude élémentaire sur l'homographie plane de période trois et sur une surface
cubique, Nouv. Ann. Math. (4), 16 (1916), 49-61.

3. M. L. Godeaux, Sur les homographies planes cycliques, Mem. Soc. Royale des Sciences de Liége,
15 (1930), 1-26.

4. W. R. Hutcherson, A cyclic involution of order seven, Bull. Amer. Math. Soc., 40 (1934),
143-151.

5. W. R. Hutcherson, Maps of certain cyclic involutions on two dimensional carriers, Bull. Amer.
Math. Soc., 37 (1931), 759-765.

6. W. R. Hutcherson, Third order involution contained on a certain seventh degree surface
(Abstract), Amer. Math. Monthly, 56 (1949), 586-587.


University of Florida 













PARALLEL CURVES 
G. P. HENDERSON 


In the Euclidean plane a curve C has a one-parameter family of parallel 
involutes and a unique evolute C* which coincides with the locus of the centres 
of the osculating circles of C. If $\bar{C}$ is parallel to C, C* is also the evolute of $\bar{C}$.

We will study parallel curves in n-dimensional Euclidean space and obtain 
generalizations of the properties given above. 


DEFINITION. Curves C and $\bar{C}$ are parallel if there is a one-to-one
correspondence between their points such that the tangents at corresponding points
are parallel and such that the join of corresponding points is perpendicular
to the tangents.


This definition was given by Da Cunha (1). 

It follows at once that parallelism is an equivalence relation. 

We denote the position vector of a point on a curve C in n-space by r. We
suppose that r is an (n + 1)-times differentiable function of the arc length s of C.
Let C have the moving n-hedron $\xi_1, \ldots, \xi_n$ and non-vanishing curvatures
$k_1, \ldots, k_{n-1}$. We use corresponding notations for curves $\bar{C}$, $C_1$, etc.

If C and $\bar{C}$ are parallel, the distance between corresponding points is constant
as we see by differentiating $(\bar{r} - r)^2$.
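The definition and the constancy of the distance can be illustrated numerically in the plane. The sketch below is only an example under stated assumptions: C is taken to be an ellipse with semi-axes 2 and 1, $\bar{C}$ is obtained by moving each point a fixed distance d along the unit normal, and derivatives are approximated by central differences. All function names are hypothetical.

```python
import math


def ellipse(t, a=2.0, b=1.0):
    """Point of the curve C for parameter t."""
    return (a * math.cos(t), b * math.sin(t))


def unit_normal(t, a=2.0, b=1.0):
    """Unit normal of C: the tangent (-a sin t, b cos t) rotated through a right angle."""
    tx, ty = -a * math.sin(t), b * math.cos(t)
    norm = math.hypot(tx, ty)
    return (ty / norm, -tx / norm)


def offset(t, d, a=2.0, b=1.0):
    """Corresponding point of the candidate parallel curve at normal distance d."""
    (x, y), (nx, ny) = ellipse(t, a, b), unit_normal(t, a, b)
    return (x + d * nx, y + d * ny)


def derivative(curve, t, h=1e-6):
    """Central-difference approximation to the tangent vector of a curve."""
    (x0, y0), (x1, y1) = curve(t - h), curve(t + h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))


def check_parallel(d=0.3, samples=12):
    """Largest violation of the three conditions over a set of sample points."""
    worst = 0.0
    for i in range(samples):
        t = 2 * math.pi * i / samples
        tan = derivative(ellipse, t)
        tan_bar = derivative(lambda u: offset(u, d), t)
        join = tuple(q - p for p, q in zip(ellipse(t), offset(t, d)))
        cross = tan[0] * tan_bar[1] - tan[1] * tan_bar[0]  # zero if the tangents are parallel
        dot = join[0] * tan[0] + join[1] * tan[1]          # zero if the join is perpendicular
        dist_err = abs(math.hypot(*join) - abs(d))         # zero if the distance equals |d|
        worst = max(worst, abs(cross), abs(dot), dist_err)
    return worst
```

The value returned by `check_parallel()` is of the order of the discretization error, which is consistent with the two curves being parallel in the sense of the definition.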

To find the curves $\bar{C}$ parallel to a given curve C we put

(1)    $\bar{r} = r + \sum_{i=1}^{n} u_i \xi_i$,

where $u_1, \ldots, u_n$ are scalar functions of s to be determined. Since $(\bar{r} - r)\xi_1 = 0$,
$u_1 = 0$. Differentiating (1) and using the Frenet formulae

(2)    $\xi_i' = -k_{i-1} \xi_{i-1} + k_i \xi_{i+1}$    $(i = 1, \ldots, n)$,

in which $k_0 = k_n = 0$, $k_i > 0$ $(i = 1, \ldots, n - 1)$, we obtain

$\bar{\xi}_1 \bar{s}' = (1 - k_1 u_2)\xi_1 + \sum_{i=2}^{n} (u_i' - k_i u_{i+1} + k_{i-1} u_{i-1})\xi_i$.

Since

(3)    $\bar{\xi}_1 = e \xi_1$    $(e = \pm 1)$,

we have

(4)    $\bar{s}' = e(1 - k_1 u_2)$

Received November 1, 1951; in revised form May 22, 1952. In the first version of this
paper only the case n = 4 was considered; most of the results being from a Ph.D. thesis
written under the direction of Professor H. S. M. Coxeter. The referee suggested that the 
theorems be generalized and stated many of the results that could be proved. I take this 
opportunity to thank him for his assistance. 















and

(5)    $u_i' = k_i u_{i+1} - k_{i-1} u_{i-1}$    $(i = 2, \ldots, n)$.

These differential equations determine an (n - 1)-parameter family $F_{n-1}$
of parallel curves. We see that there is exactly one $\bar{C}$ through every point of the
common normal (n - 1)-space $H_{n-1}(s)$.

We assume that $\bar{s}$ is defined so that $\bar{s}' > 0$. Differentiating (3) we find

(6)    $\bar{\xi}_i = e \xi_i$,    $\bar{k}_i \bar{s}' = k_i$    $(i = 1, \ldots, n)$.

In connection with these equations for $\bar{k}_i$, it should be mentioned that if the
curvatures never vanish, the sense of the vectors of the moving n-hedron of a
curve will be chosen so that the curvatures are positive.

Let $C_j : r_j = r_j(s)$ be a set of curves parallel to C. Since the distances $|r_i - r_j|$
are constant, the figure consisting of the points $r_j$ will move rigidly as s varies.
If a subfamily $F_p$ of $F_{n-1}$ intersects say $H_{n-1}(s_0)$ in a linear p-space then $F_p$
intersects every $H_{n-1}(s)$ in a linear p-space $H_p(s)$. Thus the concept of linear
dependence can be applied to parallel curves.


LEMMA 1. Let

$C_j$:    $r_j = r + \sum_{i=1}^{n} u_{ij} \xi_i$    $(j = 1, \ldots, p)$,

be p curves in $F_{n-1}$. $C, C_1, \ldots, C_p$ span an $F_p$ if and only if the Wronskian of
$u_{21}, \ldots, u_{2p}$ does not vanish.

Proof. The equations (5) imply that the $u_{2j}$ are (n - 1)-times continuously
differentiable solutions of a linear homogeneous differential equation of order
n - 1 with continuous coefficients. Hence if the Wronskian vanishes, the $u_{2j}$
are linearly dependent (2, p. 116). That is, there exist constants $a_j$, not all
zero, such that

(7)    $\sum_{j=1}^{p} u_{2j} a_j = 0$,

and since $(r_j - r)' = -k_1 u_{2j} \xi_1$, we have

$\sum_{j=1}^{p} a_j (r_j - r)' = 0$

and

(8)    $\sum_{j=1}^{p} a_j (r_j - r) = r_0$ = a constant vector.

Since $(r_j - r)\xi_1 = 0$, $r_0 \xi_1 = 0$; so that if $r_0 \neq 0$, C is in an (n - 1)-space which
contradicts $k_{n-1} \neq 0$. Hence $r_0 = 0$ and the $r_j - r$ are dependent. On the other
hand (7) can be obtained from (8) by differentiating.


LEMMA 2. The curves $C, C_1, \ldots, C_p$ of Lemma 1 span an $F_p$ if and only if
the determinant $|u_{ij}| \neq 0$ $(i = 2, \ldots, p + 1;\ j = 1, \ldots, p)$.

Proof. Using (5) we find that $k_2^{p-1} k_3^{p-2} \cdots k_p |u_{ij}|$ is the Wronskian of
$u_{21}, \ldots, u_{2p}$.









LEMMA 3. If C is on a hypersphere, every curve $\bar{C}$ parallel to C is on a concentric
hypersphere.

Proof. Let $r_0$ be the centre of the hypersphere on which C lies. Then
$(r - r_0)\xi_1 = 0$ and $(\bar{r} - r_0)\bar{\xi}_1 = e(\bar{r} - r)\xi_1 + e(r - r_0)\xi_1 = 0$. Thus $\bar{C}$ is on
a hypersphere with centre $r_0$.

DEFINITION.¹ C is a pth involute of $C^*_p$, and $C^*_p$ is a pth evolute of C if C
is an orthogonal trajectory of the osculating p-spaces of $C^*_p$ $(p = 1, \ldots, n - 1)$.


THEOREM 1. The pth involutes of a curve form an F,. 


Proof. Let $\bar C: \bar r = \bar r(s)$ be a pth involute of C. We can write

$$\bar r = r + \sum_{i=1}^{p} a_i\,\xi_i$$

in which the $a_i$ are to be determined so that $\bar\xi_1\xi_i = 0$ $(i = 1, \ldots, p)$. These
conditions are satisfied if and only if

$$(9)\qquad a_1' = k_1 a_2 - 1, \qquad a_i' = k_i a_{i+1} - k_{i-1}a_{i-1}, \qquad a_p' = -k_{p-1}a_{p-1} \qquad (i = 2, \ldots, p-1),$$

and when the $a_i$ are chosen in this way, $a_p$ does not vanish identically and $\bar\xi_1 = \pm\,\xi_{p+1}$
whenever $a_p \ne 0$.

Let $\bar C_{(1)}$ and $\bar C_{(2)}$ be pth involutes of C. $\bar\xi_{1(1)}$ is parallel to $\bar\xi_{1(2)}$ since each is
parallel to $\xi_{p+1}$, and

$$(\bar r_{(1)} - \bar r_{(2)})\,\bar\xi_{1(1)} = \pm(\bar r_{(1)} - \bar r_{(2)})\,\xi_{p+1} = 0.$$

Thus $\bar C_{(1)}$ and $\bar C_{(2)}$ are parallel.

Since (9) is a system of linear non-homogeneous differential equations for
the $a_i$, we can determine $\bar r_{(1)}, \ldots, \bar r_{(p+1)}$ so that $\bar r_{(i)} - \bar r_{(1)}$ $(i = 2, \ldots, p+1)$
are independent. Then if $\bar r$ is any other pth involute, $\bar r - \bar r_{(1)}$, $\bar r_{(i)} - \bar r_{(1)}$ are
dependent. Thus the pth involutes form an $F_p$.
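
The conditions (9) can be checked by a direct expansion. The sketch below assumes the Frenet formulas in the form $\xi_i' = -k_{i-1}\xi_{i-1} + k_i\xi_{i+1}$ (with $k_0 = 0$) and $r' = \xi_1$, which is the convention that equations (12) and (18) suggest; it is written for $p \ge 2$ and is an illustration, not part of the original proof.

```latex
\begin{align*}
\bar r' &= \xi_1 + \sum_{i=1}^{p} a_i'\,\xi_i
          + \sum_{i=1}^{p} a_i\,\bigl(-k_{i-1}\xi_{i-1} + k_i\xi_{i+1}\bigr) \\
        &= (1 + a_1' - k_1 a_2)\,\xi_1
          + \sum_{i=2}^{p-1} \bigl(a_i' + k_{i-1}a_{i-1} - k_i a_{i+1}\bigr)\,\xi_i \\
        &\quad + (a_p' + k_{p-1}a_{p-1})\,\xi_p + k_p a_p\,\xi_{p+1}.
\end{align*}
% Requiring the components along \xi_1, ..., \xi_p to vanish gives exactly (9),
% and what remains is \bar r' = k_p a_p \xi_{p+1}.
```

Under this assumption the tangent of the involute is parallel to $\xi_{p+1}$ wherever $k_p a_p \ne 0$, which agrees with the statement $\bar\xi_1 = \pm\,\xi_{p+1}$ above.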

Next we find some necessary conditions in order that $C^*_p$ shall be a pth
evolute of C. Let $S^*_p(s)$ be the osculating p-space of $C^*_p$. Put

$$(10)\qquad r^*_p = r + \sum_{i=1}^{n} b_i\,\xi_i.$$

The $b_i$ are to be determined so that $r^*_p - r$ is in $S^*_p$ and so that $\xi_1$ is orthogonal
to $S^*_p$. We see that $b_1 = 0$, and differentiating (10) and using $r^{*(i)}_p\xi_1 = 0$ $(i = 1, \ldots, p)$ we obtain

$$(11)\qquad b_i = c_i \qquad (i = 1, \ldots, p+1),$$

where $c_1, \ldots, c_n$ are defined by

$$(12)\qquad c_1 = 0, \qquad k_1 c_2 = 1, \qquad c_i' = -k_{i-1}c_{i-1} + k_i c_{i+1} \qquad (i = 2, \ldots, n-1).$$


¹At first only the cases n = 4, p = 1 and 3 were considered. The referee suggested that
involutes and evolutes of other orders be introduced.








102 G. P. HENDERSON 


Hence $b_1, \ldots, b_{p+1}$ are known. We will show later that the remaining $b_i$ can be
determined so that $C^*_p$ is a pth evolute ($p \ne n-1$ if C is on a hypersphere).


THEOREM 2. In general the curves of an $F_p$ have exactly one common pth
evolute $C^*_p$. There is an exception if and only if the members of $F_p$ are on concentric
hyperspheres whose common centre lies on all the $H_p(s)$. In this case there is no
common pth evolute.


Proof. Let $C, C_1, \ldots, C_p$ span $F_p$. If $C^*_p$ is a common pth evolute of these
curves, then $C^*_p$ is a pth evolute of every member of $F_p$. For $r^*_p - r$, $r^*_p - r_i$
$(i = 1, \ldots, p)$ are in $S^*_p$ and $\xi_1$ is orthogonal to $S^*_p$. If

$$\bar r = r + \sum_{1}^{p} a_i (r_i - r)$$

is any other curve in $F_p$, $r^*_p - \bar r$ is in $S^*_p$ and $\bar\xi_1$ is orthogonal to $S^*_p$. Thus as far
as common pth evolutes are concerned we can replace $F_p$ by $C, C_1, \ldots, C_p$.

Since $r^*_p - r$, $r^*_p - r_i$ $(i = 1, \ldots, p)$ are dependent and the $r_i - r$ are
independent we can write

$$(13)\qquad r^*_p = r + \sum_{1}^{p} \lambda_j (r_j - r).$$

Putting

$$r_j = r + \sum_{i=2}^{n} u_{ij}\,\xi_i$$

we have

$$r^*_p = r + \sum_{j=1}^{p} \sum_{i=2}^{n} \lambda_j u_{ij}\,\xi_i.$$

But $C^*_p$ is a pth evolute of C. Hence by (11), $(r^*_p - r)\xi_i = c_i$ $(i = 1, \ldots, p+1)$. Thus

$$(14)\qquad \sum_{j=1}^{p} \lambda_j u_{ij} = c_i \qquad (i = 2, \ldots, p+1).$$

By Lemma 2, the determinant $|u_{ij}|$ is not zero, so these equations determine the
$\lambda_j$ uniquely. Thus there is not more than one common pth evolute of the curves
of $F_p$, and if there is one it is the curve $C^*_p$ given by (13) and (14).

Suppose now, first, that the vectors $r^{*(i)}_p$ $(i = 1, \ldots, p)$ are linearly independent.
We then prove that $C^*_p$ actually is a common pth evolute. Differentiating (13)
and using $r^{*(i)}_p\xi_1 = 0$, we obtain

$$r^{*(i)}_p = \sum_{j=1}^{p} \lambda_j^{(i)} (r_j - r) \qquad (i = 1, \ldots, p).$$

Since the $r^{*(i)}_p$ are independent, we can solve these equations for the vectors
$r_j - r$ in terms of the $r^{*(i)}_p$. Now writing

$$r_j = r + (r_j - r) = r^*_p - \sum_{i=1}^{p} \lambda_i (r_i - r) + (r_j - r) = r^*_p + \text{a vector in } S^*_p,$$

we see that the point $r_j$ is in $S^*_p$. Since we also have $\xi_1$ perpendicular to $S^*_p$,
$C^*_p$ is the common pth evolute.










Suppose next that the vectors $r^{*(i)}_p$ are dependent so that $C^*_p$ is less than
p-dimensional. We can write

$$r^{*(p)}_p = \sum_{i=1}^{p-1} d_i\, r^{*(i)}_p,$$

and differentiating this,

$$r^{*(p+1)}_p = \sum_{i=1}^{p-1} \bigl(d_i'\, r^{*(i)}_p + d_i\, r^{*(i+1)}_p\bigr).$$

Since $r^{*(i)}_p\xi_1 = 0$ $(i = 1, \ldots, p)$, $r^{*(p+1)}_p\xi_1 = 0$. Further differentiations yield
$r^{*(i)}_p\xi_1 = 0$ $(i = 1, \ldots, n)$. When we differentiate $r^{*(i)}_p\xi_1 = 0$ we obtain
$r^{*(i)}_p\xi_2 = 0$ $(i = 1, \ldots, n-1)$. Continuing this, we have $r^{*\prime}_p\xi_j = 0$ $(j = 1, \ldots, n)$;
hence $r^{*\prime}_p = 0$ and $C^*_p$ reduces to a point. Thus there is no common
pth evolute. By (13), $r^*_p$ is on $H_p(s)$, and since $(r^*_p - r)\xi_1 = 0$, C is on a hypersphere
with centre $r^*_p$.

Finally we show that if C is on a hypersphere with centre $r_0$ and if $r_0$ is in
$H_p(s)$, there is no common pth evolute. We can write

$$r_0 = r + \sum_{1}^{p} \mu_j (r_j - r), \qquad r_0^{(j)}\xi_1 = 0 \quad (j = 1, \ldots, p).$$

But these are the conditions which determine $r^*_p$ and $\lambda_j$. Thus $r^*_p = r_0$, and
the result follows.

We observe that there is a (1-1) correspondence between pth evolutes of
C and p-spaces in $H_{n-1}(s)$ through r.


THEOREM 3. The hypersphere with centre $r^*_p$ and radius $|r^*_p - r|$ has at
least $(p+1)$th order contact with C at r.


Proof. The points of intersection of C and the hypersphere with centre
$r^*_p(s_0)$ and radius $|r^*_p(s_0) - r(s_0)|$ are obtained by solving the equation

$$f(s) = [r(s) - r^*_p(s_0)]^2 - [r(s_0) - r^*_p(s_0)]^2 = 0$$

for s. We find

$$f^{(i)}(s) = 2\,[r^*_p(s) - r^*_p(s_0)]\, r^{(i)}(s) \qquad (i = 1, \ldots, p+1).$$

Therefore

$$f^{(i)}(s_0) = 0 \qquad (i = 1, \ldots, p+1).$$


Consider the $(n-1)$th evolute $C^*_{n-1} = C^*$. Assuming that C is not on a
hypersphere, $C^*$ is the $(n-1)$th evolute of every member of $F_{n-1}$, and by
Theorem 3, $C^*$ is the locus of the centres of the osculating hyperspheres of
every curve in $F_{n-1}$. Thus the family of parallel curves has a common locus
of centres of osculating hyperspheres. (This is true even if C is on a hypersphere.)

Next we obtain the relationship between the moving n-hedrons of C and $C^*$.
Since

$$r^* = r + \sum_{2}^{n} c_i\,\xi_i,$$

$r^{*\prime} = (c_n' + k_{n-1}c_{n-1})\,\xi_n$, and we see that C is on a hypersphere if and only if
$c_n' + k_{n-1}c_{n-1} = 0$. We will assume that this expression never vanishes and that
$s^*$ is defined so that $s^{*\prime} < 0$. Then $s^{*\prime} = (c_n' + k_{n-1}c_{n-1})\,e^*$ and $\xi^*_1 = e^*\xi_n$,
where $e^* = \pm 1$. Further differentiations yield


* * * . 
(15) g = € Enti1-14 k 7s’= — ky: (= ar * 
DEFINITION. The pth polar developable $D_p$ of $F_{n-1}$ $(p = 1, \ldots, n-1)$ is
the surface

$$(16)\qquad z_p = r + \sum_{1}^{p+1} c_i\,\xi_i + \sum_{p+2}^{n} y_i\,\xi_i$$

in which $s, y_{p+2}, \ldots, y_n$ are parameters.


A particular curve C of F,_; has been used in this definition. In order to 
justify this we will prove that the same surface is obtained if we use any other 
curve 


C:F=r+ > ud: 
I 


of F,_;. We now have 


Z=fF+ > @é+ > ge: 
1 p+2 
p+1 n 
=rt > (eit udiit Do (Gi + udls 
1 pr2 
Using (4), (5), (6) and (12) we obtain @€,+u,;=c, (t#=1,..., n) so that 


Zp = 2p. 


THEOREM 4. (a) The pth evolutes of a member of $F_{n-1}$ are on $D_p$.

(b) $D_{p+1}$ is the envelope of the $(n-p-1)$-spaces which generate $D_p$ $(p = 1, \ldots, n-2)$.

(c) If C is not on a hypersphere, $D_p$ is generated by the $(n-p-1)$-dimensional
osculating spaces of $C^* = D_{n-1}$.

(d) If $F_p$ has a pth evolute, this evolute is the locus of the point of intersection
of $H_p(s)$ and the corresponding generator of $D_p$.

(e) The first evolutes of the curves in $F_{n-1}$ are geodesics on $D_1$.

(f) If x = x(s) is a geodesic on $D_1$ and is not a straight line, x is a first evolute
of some member of $F_{n-1}$.


Proof. (a) Compare (10), (11) and (16). 

(b) The equations of a generator of D, are (R— ris =e, (G@=1,..., 
pb + 1). Differentiating these, we see that there is an envelope and that it is 
Dy. 

(c) The (” — p — 1)-dimensional osculating spaces of C* are 


n—p—1 


R=rt+ Dy ik: 


1 


in which the y*,; are parameters. This is 
















R=r+ > cds + we ane bt 
1 p+? 
using (15). When we put 
a Ty q = Fe ((=p+2,...,%), 


R becomes 2,. 


(d) Let C, Ci, ..., C, span F, and let the point of intersection of H,(s) 
and the generator be R. Since R is in H,, 


Dv 
R=r+ > urs — 1). 
1 


The generator is (R — r)§; = c; (j = 1,..., 0+ 1). The unique solution of 
these equations is u; = A, R = r*,. 

(e) The principal normal of C*, is + & which is normal to D;. Hence C*, 
is a geodesic. 

(f) We will show that every first involute of x = x(s) is parallel to C. Let 
y = y(s) be a first involute of x. Since x is a geodesic, fa.) = + & and since 
y is a first involute of x, Ey) = + a2). Thus £1) is parallel to &;. Also (y — r)é; 
= (y — x)&; + (x — r)é,. The first term of this is zero because y — x is parallel 
to £1.) and the second term is zero because x is on D,. Hence y = y(s) is parallel 
to C. 

Next we want to develop D,; on an (m — 1)-space and determine the point of 
the (m — 1)-space which corresponds to (s, ys, . . . , Ya) Of Dy. Since D, is the 
tangent hypersurface of C*,_, (provided C is not on a hypersphere) we will 
first see how to develop the tangent hypersurface of a given curve C on an 
(n — 1)-space H. Vectors in H will be denoted by capital letters. 

C will roll along a curve R = R(s) in H and R will have arc length s. The 
point 


z=rt+ > vd. 
1 


of the tangent hypersurface is mapped on 
Z=R+ Dod 
1 


where (7), . . . , Z,—-1) is the moving (m — 1)-hedron of R. Now using the fact 
that the line element is invariant under this transformation we find that & 
for R is equal to k, for C (i = 1,...,m — 2). 
Turning now to the first polar developable of C, the point 
n—2 


ser td sb cme tice t Dd (rte y ati-dbs 
1 ; 


of D; corresponds to 
Z= R® + pm ys a 


1 
where R* and 7*, are determined by 
















d d a... 
a =T",, a = Rath Tig, Sete -— Beals 
(= 1 ,n — 2) 
Now 


R* = f T’,ds* = f €* (Cy—thn-1 + cn) T'ids 
and after integrating by parts m — 1 times we obtain 
Sees gan 
Let us put ; 
se =T7T, c;+ po =y, (¢=1,...,.n—1;7 =3,..., n). 


The point 
reste t+ Do yk 


of D, corresponds to 


(17) €2T; + > ¥iT gs 
3 
where 7), . . . , 7,—: are solutions in H of 
(18) Ty’ = k2T>, T; =—kTyit+ ResrT tas (7 m= 2,...,%— 1). 


We have assumed above that C is not on a hypersphere. However, once 
we have (17) and (18) we can easily verify, by comparing line elements, that 
they are correct in this case also. 

Let A be an arbitrary constant vector in H. The general solution of the dif- 
ferential equations (5) is 


Wiss = AT, (¢@=1,...,#—1). 


Thus we have a (1 — 1) correspondence between points of H and curves of 
F,_;. The geometrical significance of the point A which corresponds to C of 
F,-1 is that if the whole n-space is moved rigidly when we are developing D,, 


C cuts H in the fixed point A. This follows from the fact that the point of 
intersection of C and the tangent (m — 1)-space of D, is 


r + ym Us; 


and since T; = e*7*,_1 = e*&*,_1 = £2, the transform of this point is 


LwToa= DL (ATA) Ti = A. 


9 
- 


THEOREM 5. When $D_1$ is developed on H, the pth evolutes of curves of $F_{n-1}$
become p-dimensional curves in H.


Proof. We may assume that the pth evolute under consideration is the 
common pth evolute C*, of Ci,..., Cyii1 where Ci,..., Cys: are parallel 
curves which span an F,. Let C; correspond to the point A, of H. 







p+i n 
r= + p> Ary — nn) =r + » > (A:T j-1)é; 
j= 


p+il «a 
+ » D Ah (Ae — Ar) Tyalés, 
t=—2 j=? 
which transforms to 
7 p+l sn 
R’, = D> (AiT;-1) T 5-1 + > > Ail (A, —_ A;) T ;-1} Ty 
i—2 j= 
p+ 
=A,+ » A(Ay — Ai). 
Thus the transform is in the p-space determined by A,, ... , Apss. (If the A, 
were in a (p — 1)-space, C;, . . . , C,41 would not span an F,). If R*, were in a 
(p — 1)-space the determinant |d | (i =2,...,p+1;j=1,..., ) would 


be identically zero; the vectors r*,‘” would be dependent, and C; 
would have no common pth evolute. 

We now have a (1 — 1) correspondence between pth evolutes of curves of 
F,1 and p-spaces in H. 

Consider the case p = 1. The transform of the common first evolute of C, 
and C; is R*,; = A; + A2(Az2 — A;), the straight line through A, and Az. 
C*, actually corresponds to a segment of this line. The segment includes A, 
if and only if A, = 0 for some s. The family of first evolutes of C, corresponds 
to the family of straight lines through A,. Thus we have the following theorem. 

THEOREM 6. The family of geodesics (which are not straight lines) through a
point of $D_1$ is the family of first evolutes of some curve of $F_{n-1}$.

THEOREM 7. The angle between the tangents at corresponding points of two 
first evolutes of a given curve is constant. This angle is equal to the angle between 
the lines in H which correspond to the two first evolutes. 

Proof. Let the given curve be C and let C*y» (¢ = 1, 2) be the common 
first evolute of C and C;. Since the tangent of C*» is parallel to r; — r, the 
cosine of the angle between the tangents is 


n—l n—1 
(i= NGe= 9) _ b Tatu S (asridén | 
_ 2 n—1 7 a—I a 
em ATS (asraten] [ SAT DE] 
n—1 “a1 : st 
p> (a7) (4sT.) / = (A:T) 2 (aT) 


A,A./V/ AjAi. 













REFERENCES 


1. P. J. Da Cunha, Du parallelisme dans l'espace Euclidien, Portugaliae Math., 2 (1941), 
181-246. 


2. E. L. Ince, Ordinary differential equations (London, 1927). 


University of Western Ontario 











THE STEINITZ-GROSS THEOREM ON SUMS OF 
VECTORS 


F. A. BEHREND 
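1. Introduction. $\alpha_1, \alpha_2, \ldots, \alpha_p$ are n-dimensional vectors,

$$\sum_{\rho=1}^{p} \alpha_\rho = 0, \qquad |\alpha_\rho| \le 1 \quad (1 \le \rho \le p);$$

they are arranged to form a closed polygon

$$O A_1 A_2 \ldots A_{p-1} O \qquad (OA_1 = \alpha_1, \ldots, A_{\rho-1}A_\rho = \alpha_\rho, \ldots, A_{p-1}O = \alpha_p).$$

Denote by $R(\alpha_1, \alpha_2, \ldots, \alpha_p)$ the radius of the smallest circumscribed hypersphere
with centre at O; by $\bar R(\alpha_1, \alpha_2, \ldots, \alpha_p)$ the minimum of

$$R(\alpha_1, \alpha_{\rho_2}, \ldots, \alpha_{\rho_{p-1}}, \alpha_p)$$

for all possible reorderings

$$\alpha_{\rho_2}, \ldots, \alpha_{\rho_{p-1}}$$

of $\alpha_2, \ldots, \alpha_{p-1}$; and by $c_n$ the least possible constant such that

$$\bar R(\alpha_1, \alpha_2, \ldots, \alpha_p) \le c_n$$

for all possible choices of p and $\alpha_1, \alpha_2, \ldots, \alpha_p$.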
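
For small p the two quantities defined above can be computed by exhaustive search. The following is only an illustrative sketch: the function names `radius` and `best_radius` and the sample vectors are hypothetical, and plane vectors are used purely to keep the data short.

```python
from itertools import permutations
from math import sqrt

def norm(v):
    """Euclidean length of a vector given as a sequence of coordinates."""
    return sqrt(sum(x * x for x in v))

def radius(order):
    """R for one ordering: largest distance from O to a vertex of the polygon."""
    partial = [0.0] * len(order[0])
    worst = 0.0
    for v in order:
        partial = [s + x for s, x in zip(partial, v)]
        worst = max(worst, norm(partial))
    return worst

def best_radius(vectors):
    """R-bar: minimise R over reorderings that keep the first and last vector fixed."""
    first, *middle, last = vectors
    return min(radius([first, *perm, last]) for perm in permutations(middle))

# Hypothetical sample data: four vectors of length 1 whose sum is zero.
sample = [(1.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (-1.0, 0.0)]
print(radius(sample))       # the ordering as given
print(best_radius(sample))  # the best ordering with the end vectors fixed
```

The search is factorial in p, so it only serves to make the definitions concrete; the constructions in the paper are what give bounds for arbitrary p.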

Steinitz (1) proved that $c_n \le 2(n+1)$; using induction with respect to n,
Gross (2) obtained the weaker estimate $c_n \le 2^n - 1$; by the same method
Bergström (3) obtained the result $c_n^2 \le 4c_{n-1}^2 + 1$. Trivially, $c_1 = 1$. $c_2 = \sqrt2$
was proved independently by Gross (2), Bergström (4), and Damsteeg and
Halperin (5). For $n \ge 3$ the exact values of $c_n$ are not known; from Bergström's
estimate it follows that $c_3 \le 3$, $c_4 \le \sqrt{37}$; for $n \ge 5$, Steinitz's estimate gives
the best result.
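
The values 3 and $\sqrt{37}$ follow from iterating Bergström's inequality starting at $c_2^2 = 2$. A minimal sketch of that arithmetic (the function name `bergstrom_squares` is hypothetical):

```python
from math import sqrt

def bergstrom_squares(c2_squared, n_max):
    """Iterate c_n^2 <= 4*c_{n-1}^2 + 1 starting from the known value c_2^2."""
    bounds = {2: c2_squared}
    for n in range(3, n_max + 1):
        bounds[n] = 4 * bounds[n - 1] + 1
    return bounds

bounds = bergstrom_squares(2, 5)   # c_2 = sqrt(2), so c_2^2 = 2
for n, sq in bounds.items():
    print(n, sq, round(sqrt(sq), 3))
```

The squared bounds roughly quadruple at each step, which is why a bound that grows linearly in n overtakes this recursion from n = 5 onwards.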

By a refinement of Steinitz's original method it will be shown in this paper
that, for $n \ge 3$, $c_n < n$ (Theorem 1), and particularly, $c_3 \le (5 + 2\sqrt3)^{1/2} =
2.90\ldots$ (Theorem 2).

The lower estimate $c_n \ge \tfrac12 (n+6)^{1/2}$ given by Damsteeg and Halperin (5),
and other examples make it likely that the true order of $c_n$ is $n^{1/2}$.


2. Notation. Greek letters except $\kappa, \lambda, \mu, \nu, \rho$ denote n-dimensional vectors
$(n \ge 3)$; a, b, c, d, e, f, g, x, y, z real numbers; i, j, k, l, m, n, p, q, r, s, t, $\kappa, \lambda, \mu,
\nu, \rho$ natural numbers.

$|\alpha|$ denotes the length of $\alpha$; $\alpha\beta$ the scalar product of $\alpha$, $\beta$.

The vectors $\theta_1, \theta_2, \ldots, \theta_m$ will be called positively dependent (p.d.) if they are
linearly dependent with non-negative coefficients; positively independent (p.i.)
means not p.d.


Received March 12, 1953. 





ON SUMS OF VECTORS 109 


3. Lemmas.
(I) From any m $(\ge n+1)$ p.d. vectors $\theta_1, \theta_2, \ldots, \theta_m$, n + 1 p.d. vectors
$\theta_{\mu_1}, \theta_{\mu_2}, \ldots, \theta_{\mu_{n+1}}$ can be selected.


Proof. It is sufficient to show that m − 1 of the given vectors are p.d. If,
in the given relation

$$\sum_{\nu=1}^{m} d_\nu\theta_\nu = 0, \qquad d_\nu \ge 0$$

(at least one $d_\nu$ being positive), one $d_\nu$ is zero, this is trivial. If all $d_\nu > 0$,
choose any linear relation between $\theta_1, \ldots, \theta_{n+1}$,

$$\sum_{\nu=1}^{n+1} a_\nu\theta_\nu = 0 \qquad (\text{not all } a_\nu = 0),$$

and consider the relation

$$\sum_{\nu=1}^{n+1} (d_\nu - x a_\nu)\theta_\nu + \sum_{\nu=n+2}^{m} d_\nu\theta_\nu = 0.$$

For x = 0 all coefficients are positive; hence x can be chosen such that one
coefficient vanishes, the others remaining non-negative (and $d_m$ positive);
the ensuing relation expresses the p.d. of m − 1 of the vectors.
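
The reduction step in this proof can be repeated until at most n + 1 coefficients are non-zero. The sketch below does this with exact rational arithmetic; the helper names `null_vector` and `reduce_dependence` and the sample data are hypothetical, and the linear relation among n + 1 vectors is found by ordinary Gaussian elimination, which is one possible choice rather than anything prescribed by the text.

```python
from fractions import Fraction

def null_vector(cols):
    """Return a non-zero a with sum(a[j] * cols[j]) == 0; needs len(cols) > dimension."""
    n, k = len(cols[0]), len(cols)
    rows = [[Fraction(cols[j][i]) for j in range(k)] for i in range(n)]
    pivots = []                      # (row, column) of each pivot
    r = 0
    for c in range(k):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue                 # no pivot in this column: it stays free
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [v - f * w for v, w in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
        if r == n:
            break
    pivot_cols = {c for _, c in pivots}
    free = next(c for c in range(k) if c not in pivot_cols)
    a = [Fraction(0)] * k
    a[free] = Fraction(1)            # set one free variable to 1, solve for the pivots
    for row, c in pivots:
        a[c] = -rows[row][free]
    return a

def reduce_dependence(vectors, d):
    """Shrink sum(d[v] * vectors[v]) == 0 with d >= 0 to at most n + 1 non-zero terms."""
    n = len(vectors[0])
    d = [Fraction(x) for x in d]
    while True:
        support = [v for v, x in enumerate(d) if x > 0]
        if len(support) <= n + 1:
            return d
        chosen = support[: n + 1]
        a = null_vector([vectors[v] for v in chosen])
        if all(x <= 0 for x in a):
            a = [-x for x in a]      # make sure some a_v is positive
        x = min(d[v] / a_v for v, a_v in zip(chosen, a) if a_v > 0)
        for v, a_v in zip(chosen, a):
            d[v] -= x * a_v          # one coefficient drops to 0, none goes negative

# Hypothetical sample: five plane vectors with a positive relation (n = 2).
thetas = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1)]
reduced = reduce_dependence(thetas, [1, 1, 2, 2, 1])
print(reduced)
```

Each pass leaves the coefficients outside the chosen n + 1 vectors untouched, so the reduced relation is never the trivial one, mirroring the remark "and $d_m$ positive" in the proof.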


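(I.1) $\theta_{\mu_1}, \theta_{\mu_2}, \ldots, \theta_{\mu_{n+1}}$ in (I) may be prescribed to include $\theta_1$.

Proof. Suppose $\theta_1$ is not already included. Let

$$\sum_{i=1}^{n+1} b_i\theta_{\mu_i} = 0$$

be the relation expressing the p.d. of $\theta_{\mu_1}, \ldots, \theta_{\mu_{n+1}}$.
If one $b_i = 0$, the term $0\cdot\theta_1$ may be substituted for $b_i\theta_{\mu_i}$. If all $b_i > 0$, consider
any linear relation between $\theta_{\mu_1}, \ldots, \theta_{\mu_n}, \theta_1$:

$$\sum_{i=1}^{n} e_i\theta_{\mu_i} + e_{n+1}\theta_1 = 0 \qquad (\text{not all } e_i = 0).$$

It may be assumed that $e_{n+1} \ge 0$. If all $e_i \ge 0$, $\theta_{\mu_1}, \ldots, \theta_{\mu_n}, \theta_1$
are p.d. If one $e_i < 0$, consider the relation

$$\sum_{i=1}^{n} (b_i + x e_i)\theta_{\mu_i} + b_{n+1}\theta_{\mu_{n+1}} + x e_{n+1}\theta_1 = 0.$$

For x = 0 all coefficients in the first sum are positive; hence x > 0 can be
determined so that one coefficient vanishes, the others remaining non-negative
(and $b_{n+1}$ positive). The following corollary is obvious: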











110 F. A. BEHREND 


(I.2) In (I) and (I.1) $\theta_m$ may be excluded from
$\theta_{\mu_1}, \ldots, \theta_{\mu_{n+1}}$,
unless $\theta_1, \ldots, \theta_{m-1}$ are p.i.


(II) If $m \ge n$,

$$\theta = \sum_{\mu=1}^{m} d_\mu\theta_\mu \ne 0, \qquad 0 \le d_\mu \le 1,$$

then $\theta$ can be expressed in the form

$$\theta = \sum_{\mu=l}^{m} d'_\mu\theta'_\mu, \qquad \begin{cases} 0 < d'_\mu \le 1, & \mu < l+n, \\ d'_\mu = 1, & \mu \ge l+n, \end{cases}$$

where $1 \le l \le m$, and the $\theta'_\mu$ are a rearrangement of the $\theta_\mu$.


Proof. Let r be the number of $d_\mu$'s with $0 < d_\mu < 1$. If $r \le n$, then the
required relation is obtained from the given one by omitting the terms with
coefficient 0. It is therefore sufficient to show that, for $r \ge n+1$, the value
of r can be diminished. Suppose that $0 < d_\mu < 1$ for $1 \le \mu \le n+1$; using
a linear relation

$$\sum_{\mu=1}^{n+1} a_\mu\theta_\mu = 0 \qquad (\text{not all } a_\mu = 0),$$

form

$$\theta = \sum_{\mu=1}^{n+1} (d_\mu - x a_\mu)\theta_\mu + \sum_{\mu=n+2}^{m} d_\mu\theta_\mu.$$

For x = 0 the first n + 1 coefficients lie between 0 and 1; hence x can be chosen
such that one coefficient becomes equal to 0 or 1, the others remaining $\ge 0$,
$\le 1$. As $\theta \ne 0$, the final representation of $\theta$ contains at least one term, i.e.,
$l \le m$.


(II.1) The representation

$$\theta = \sum_{\mu=l}^{m} d'_\mu\theta'_\mu$$

in (II) may be so chosen that either $\theta_1 = \theta'_l$ or $\theta_1$ does not occur at all.

Proof. Suppose $\theta_1$ occurs in the relation obtained. (II.1) is obvious if the
coefficient of $\theta_1$ is less than 1 or if fewer than n coefficients are less than 1 (only
a trivial reordering of the $\theta'_\mu$ being required). If the coefficient of $\theta_1$ is 1, and
exactly n coefficients are less than 1, i.e.,

$$0 < d'_\mu < 1 \ \text{for}\ \mu = l, \ldots, l+n-1, \qquad \theta_1 = \theta'_s,\ s \ge l+n,\ d'_s = 1,$$

use a linear relation

$$\sum_{\mu=l}^{l+n-1} b_\mu\theta'_\mu + b_s\theta'_s = 0 \qquad (b_\mu,\ b_s \text{ not all } 0)$$

to form

$$\theta = \sum_{\mu=l}^{l+n-1} (d'_\mu - x b_\mu)\theta'_\mu + (1 - x b_s)\theta_1 + \sum_{\substack{\mu=l+n \\ \mu \ne s}}^{m} d'_\mu\theta'_\mu.$$

It may be assumed that $b_s \ge 0$; letting x increase from 0, either $\theta_1$ can be
eliminated from the relation, or one of the first coefficients can be made equal
to 0 or 1 (the others remaining $\ge 0$, $\le 1$); in this case $\theta_1$ can be incorporated
in the first n terms and be renamed $\theta'_l$.


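(III) If $k \ge 2$, $|\theta_\kappa| \le 1$ $(1 \le \kappa \le k)$,

$$\eta = d_1\theta_1 - \sum_{\kappa=2}^{k} d_\kappa\theta_\kappa, \qquad 0 \le d_1 \le 1,\ 0 \le d_\kappa \le 1\ (\kappa > 1),$$

then

$$|\theta_1 + \theta_\kappa| \ge 1, \qquad 1 < \kappa < k,$$

implies

$$|\eta| \le \sqrt{k^2 - 3k + 3} + 1.$$

Proof. For k = 2, $|\eta| \le |d_1\theta_1| + |d_2\theta_2| \le 2 = \sqrt1 + 1$; for $k \ge 3$,

$$(\theta_1 + \theta_\kappa)^2 = \theta_1^2 + 2\theta_1\theta_\kappa + \theta_\kappa^2 \ge 1, \qquad 1 < \kappa < k,$$

implies

$$-2\theta_1\theta_\kappa \le \theta_1^2 + \theta_\kappa^2 - 1 \le 1,$$

whence

$$|\eta| \le \Bigl| d_1\theta_1 - \sum_{\kappa=2}^{k-1} d_\kappa\theta_\kappa \Bigr| + |d_k\theta_k|
\le \Bigl\{ d_1^2\theta_1^2 - \sum_{\kappa=2}^{k-1} 2 d_1 d_\kappa\,\theta_1\theta_\kappa + \Bigl( \sum_{\kappa=2}^{k-1} d_\kappa\theta_\kappa \Bigr)^2 \Bigr\}^{1/2} + 1$$

$$\le \{1 + (k-2) + (k-2)^2\}^{1/2} + 1 = (k^2 - 3k + 3)^{1/2} + 1.$$

(III.1) If the condition $|\theta_1 + \theta_k| \ge 1$ is added in (III), then

$$|\eta| \le (k^2 - k + 1)^{1/2}.$$

Proof. By obvious modification of the proof of (III).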
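(IV) If $k \ge 2$, $|\theta_\kappa| \le 1$ $(1 \le \kappa \le k)$,

$$\eta' = \sum_{\kappa=2}^{k} d_\kappa\theta_\kappa, \qquad \zeta' = \sum_{\kappa=2}^{k} (1 - d_\kappa)\theta_\kappa, \qquad 0 \le d_\kappa \le 1,$$

$$\eta = \theta_1 + \eta', \qquad \zeta = -\theta_1 + \zeta',$$

then $|\zeta| \ge 1$ implies

$$|\eta| \le k - (2 - \sqrt2).$$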
Proof. For k = 2, 
t? = (—6, + (1 — dz)62)* = 0," — 2(1 — d2)0:02 + (1 — d2)*O.” > 1 











implies 1 — d, > 0 and 


2 
20:02 < aoa + (1 — dz) 6." < (1 — d2) 6,”, 
— 2 
whence 


n = 0: + 22602 + dO." < 0;° + dO." < 2, 


In] < V2 =2— (2 —- v2). 
Let k > 3. As’ = 0 would imply \n| = \@,| < 1, it may be assumed that n’ ¥ 0. 
Similarly, Ie| > 1, implies ¢’ ~ 0. Let 6,’,..., 6’ be the projections of 62, ..., 
6, into a plane containing 7’ and ¢’. Then 
o o k 
v= 2edb', Y= Ve(l—d)o’, +e = 26’, [al <1. 
It may be assumed that the component of every 0,’ (2 < « < k), and hence the 
component of {’, in the 7’-direction is positive, as otherwise |n'| < & — 2 and 
in| = |. + 9'| << k —1<k — (2 — V2). The @,’ may then be so renumbered 











that they form a convex polygon which encloses the parallelogram formed by 
n’, ¢’. Defining w’ as shown in the Figure, 


(1) nf =o’ + 2’, s>0, 







where 
kl 
(2) lw" < > |0.’|<k-2 
a=? 
z 
(3) o”| + (2 + 1)Ie"| < > |6,’| <k —1. 
a=? 


By assumption, 


f= (-6, + ¢’)? = 0," — 20, +e? > 1, 


whence 
20 < 6° +27 -1< ¢”, 
and 
(4) (st + 0:)° = 2°¢"? + Qet'@, + 0," < (2° + 2)? +1 


< (@+ 1)? +1< (R-1- |o'|)*? +1, 
by (3). By (1), (4), 
In| = |n’ + 0;| = |w’ + 2t’ + 0,| 
|o”| + |2t’ + 01] < |w’| + ((e — 1 — Jw’|)* + 1)! 
The last expression increases with |w’| and takes its greatest value, by (2), 
for |w’| = k — 2, i.e., 
ln] <k —-2+4+ V2. 
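(V) If

$$\eta = \xi + \sum_{\mu=1}^{m} \theta_\mu, \qquad |\xi| \le a,\ |\eta| \le b,\ b > 0,\ |\theta_\mu| \le 1\ (1 \le \mu \le m),$$

$$1 \le m \le 2a(a - b)$$

(which implies a > b), then $\theta_\lambda = \theta'_1$ can be selected such that $|\xi + \theta'_1| \le a$.

Proof. Select $\theta'_1 = \theta_\lambda$ such that $(\xi + \theta_\lambda)^2 \le (\xi + \theta_\mu)^2$ for $1 \le \mu \le m$; then

$$(\xi + \theta'_1)^2 \le \frac1m \sum_{\mu=1}^{m} (\xi + \theta_\mu)^2 = \frac1m \Bigl( m\xi^2 + 2\xi(\eta - \xi) + \sum_{\mu=1}^{m} \theta_\mu^2 \Bigr)$$

$$\le \Bigl(1 - \frac2m\Bigr)\xi^2 + \frac2m\,\xi\eta + 1 \le \Bigl(1 - \frac2m\Bigr)a^2 + \frac2m\,ab + 1 = a^2 - \frac{2a(a-b) - m}{m} \le a^2,$$

provided that $m \ge 2$. For m = 1, $\theta'_1 = \theta_1$, $|\xi + \theta'_1| = |\eta| \le b < a$.


(V.1) Under the conditions of (V) a rearrangement $\theta'_1, \ldots, \theta'_m$ of $\theta_1, \ldots, \theta_m$
exists such that

$$\Bigl| \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr| \le a, \qquad 1 \le q \le m.$$

Proof. Successive application of (V).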
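
Read as a procedure, (V) and (V.1) amount to a greedy rule: at each step append the remaining vector that makes the new partial sum shortest. The sketch below illustrates this; the function name `greedy_order` and the sample data (chosen with a = 2, b = 1, m = 4, so that $m \le 2a(a-b)$) are hypothetical.

```python
from math import sqrt

def norm(v):
    """Euclidean length of a vector given as a sequence of coordinates."""
    return sqrt(sum(x * x for x in v))

def greedy_order(xi, thetas):
    """Order thetas so that each new partial sum xi + theta'_1 + ... is as short as possible."""
    current = list(xi)
    remaining = [tuple(t) for t in thetas]
    ordered, lengths = [], []
    while remaining:
        # the selection rule of (V): minimise |current + theta| over the remaining vectors
        best = min(remaining,
                   key=lambda t: norm([c + x for c, x in zip(current, t)]))
        remaining.remove(best)
        current = [c + x for c, x in zip(current, best)]
        ordered.append(best)
        lengths.append(norm(current))
    return ordered, lengths

# Hypothetical sample in 3-space: |xi| <= a = 2, eta = xi + sum(thetas) = 0, so |eta| <= b = 1.
xi = (1.5, 0.0, 0.0)
thetas = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (-0.5, 0.0, 0.0)]
ordered, lengths = greedy_order(xi, thetas)
print(ordered)
print(lengths)
```

After each selection the new partial sum again has length at most a, and fewer vectors remain, so the hypotheses of (V) hold again for the next step; that is the "successive application" used in the proof of (V.1).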











It can easily be verified that the conditions of (V) and (V.1) are satisfied
in the following two cases:

(V.2) $a = (n^2 - 3n + 3)^{1/2} + 1$, $b = (k^2 - 3k + 3)^{1/2} + 1$,
$2 \le k \le n-1$, $1 \le m \le 2n - k$.

(V.3) $a = (n^2 - 3n + 3)^{1/2} + 1$, $b = 1$, $1 \le m \le 2n^2 - 4n + 3$.
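(VI) If $m \ge 1$, $|\theta_\mu| \le 1$ $(1 \le \mu \le m)$, $a > 0$, $b \ge 0$,

$$\eta = \xi + \sum_{\mu=1}^{m} \theta_\mu, \qquad \eta^2 \le a^2, \qquad \xi^2 \le a^2 + b^2,$$

then $\theta_\lambda = \theta'_1$ can be selected such that

$$(\xi + \theta'_1)^2 \le a^2 + b_1^2, \qquad b_1^2 = \frac{m-1}{m}\,b^2 + 1.$$

Proof. For m = 1, $(\xi + \theta'_1)^2 = (\xi + \theta_1)^2 = \eta^2 \le a^2 \le a^2 + 1 = a^2 + b_1^2$.
If $m \ge 2$, select $\theta'_1 = \theta_\lambda$ as in (V); then

$$(\xi + \theta'_1)^2 \le \Bigl(1 - \frac2m\Bigr)\xi^2 + \frac2m\,\xi\eta + 1
\le \Bigl(1 - \frac2m\Bigr)(a^2 + b^2) + \frac2m\,a(a^2 + b^2)^{1/2} + 1$$

$$\le \Bigl(1 - \frac2m\Bigr)(a^2 + b^2) + \frac2m\Bigl(a^2 + \tfrac12 b^2\Bigr) + 1
= a^2 + \frac{m-1}{m}\,b^2 + 1 = a^2 + b_1^2.$$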
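(VII) If $m \ge 1$, $|\theta_\mu| \le 1$ $(1 \le \mu \le m)$,

$$\eta = \xi + \sum_{\mu=1}^{m} \theta_\mu, \qquad |\eta| \le a, \quad |\xi| \le a,$$

then a rearrangement $\theta'_1, \ldots, \theta'_m$ of $\theta_1, \ldots, \theta_m$ exists such that

$$f(m)^2 = \max_{1 \le q \le m} \Bigl( \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr)^2 \le a^2 + \tfrac32 + e^{-1}\bigl(m - \tfrac12\bigr)$$

for $m \ge 1$, and in particular,

$$f(1)^2 \le a^2, \qquad f(2)^2 \le a^2 + 1, \qquad f(3)^2 \le a^2 + \tfrac32, \qquad f(4)^2 \le a^2 + \tfrac{11}{6}.$$

Proof. Applying (VI), with b = 0, $\theta'_1$ can be selected such that

$$\xi_1^2 = (\xi + \theta'_1)^2 \le a^2 + b_1^2, \qquad b_1 = 1;$$

applying (VI) again, $\theta'_2$ can be selected such that

$$\xi_2^2 = (\xi_1 + \theta'_2)^2 = \Bigl( \xi + \sum_{\mu=1}^{2} \theta'_\mu \Bigr)^2 \le a^2 + b_2^2, \qquad b_2^2 = \frac{m-2}{m-1}\,b_1^2 + 1;$$

and continued application of (VI) will lead to

$$\xi_q^2 = (\xi_{q-1} + \theta'_q)^2 = \Bigl( \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr)^2 \le a^2 + b_q^2, \qquad b_q^2 = \frac{m-q}{m-q+1}\,b_{q-1}^2 + 1$$

for $q \le m$. Hence,

$$f(m)^2 \le a^2 + b_r^2, \qquad b_r^2 = \max_{1 \le q \le m} b_q^2.$$

Now

$$(5)\qquad b_q^2 = 1 + (m - q) \sum_{\kappa=1}^{q-1} \frac{1}{m - \kappa},$$

and

$$b_{r+1}^2 - b_r^2 = 1 - \sum_{\kappa=1}^{r} \frac{1}{m - \kappa} \le 0, \qquad b_r^2 - b_{r-1}^2 = 1 - \sum_{\kappa=1}^{r-1} \frac{1}{m - \kappa} \ge 0,$$

i.e.

$$(6)\qquad \sum_{\kappa=1}^{r-1} \frac{1}{m - \kappa} \le 1 \le \sum_{\kappa=1}^{r} \frac{1}{m - \kappa}.$$

Now

$$\frac{1}{m - \kappa} \le \int_{\kappa - 1/2}^{\kappa + 1/2} \frac{dx}{m - x};$$

hence, by (6),

$$1 \le \log \frac{m - \tfrac12}{m - r - \tfrac12},$$

whence

$$(7)\qquad m - r \le e^{-1}\bigl(m - \tfrac12\bigr) + \tfrac12,$$

and

$$f(m)^2 \le a^2 + b_r^2 = a^2 + 1 + (m - r) \sum_{\kappa=1}^{r-1} \frac{1}{m - \kappa}
\le a^2 + 1 + \bigl( e^{-1}(m - \tfrac12) + \tfrac12 \bigr)\cdot 1 = a^2 + \tfrac32 + e^{-1}\bigl(m - \tfrac12\bigr),$$

by (5), (6), (7). The relation $f(1)^2 \le a^2$ is trivial. For
m = 2, $b_1 = b_2 = 1$, whence $f(2)^2 \le a^2 + 1$;
m = 3, $b_1^2 = 1$, $b_2^2 = \tfrac32$, $b_3 = 1$, whence $f(3)^2 \le a^2 + \tfrac32$;
m = 4, $b_1^2 = 1$, $b_2^2 = \tfrac53$, $b_3^2 = \tfrac{11}{6}$, $b_4^2 = 1$, whence $f(4)^2 \le a^2 + \tfrac{11}{6}$.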
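
The numbers quoted for m = 2, 3, 4 can be reproduced from the recursion $b_1^2 = 1$, $b_q^2 = \frac{m-q}{m-q+1}\,b_{q-1}^2 + 1$ obtained by repeated use of (VI). The sketch below (function names hypothetical) evaluates the recursion exactly, compares it with the closed form (5), and checks the general bound $\tfrac32 + e^{-1}(m - \tfrac12)$ numerically for a range of m.

```python
from fractions import Fraction
from math import e

def b_squares(m):
    """b_1^2, ..., b_m^2 from b_1^2 = 1 and b_q^2 = (m-q)/(m-q+1) * b_{q-1}^2 + 1."""
    values = [Fraction(1)]
    for q in range(2, m + 1):
        values.append(Fraction(m - q, m - q + 1) * values[-1] + 1)
    return values

def closed_form(m, q):
    """Formula (5): b_q^2 = 1 + (m-q) * sum_{kappa=1}^{q-1} 1/(m-kappa)."""
    return 1 + (m - q) * sum(Fraction(1, m - k) for k in range(1, q))

for m in (2, 3, 4):
    print(m, [str(v) for v in b_squares(m)])

# Numerical check of the general bound for a range of m (floating-point comparison).
bound_holds = all(max(b_squares(m)) <= 1.5 + (m - 0.5) / e for m in range(1, 40))
print(bound_holds)
```

For small m the exact maxima are noticeably below the general bound, which is why the cases m = 2, 3, 4 are listed separately in (VII).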
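(VII.1) If, in (VII),

$$a = (n^2 - 3n + 3)^{1/2} + 1, \qquad n \ge 3, \qquad m \le n,$$

then

$$\Bigl| \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr| \le g(n) < n, \qquad 1 \le q \le m,$$

where g(n) is defined in the following proof.


Proof. For n = 3,

$$\Bigl( \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr)^2 \le (\sqrt3 + 1)^2 + \tfrac32 = \tfrac{11}{2} + 2\sqrt3 = g(3)^2, \qquad g(3) < 2.995 < 3.$$

For n = 4,

$$\Bigl( \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr)^2 \le (\sqrt7 + 1)^2 + \tfrac{11}{6} = \tfrac{59}{6} + 2\sqrt7 = g(4)^2, \qquad g(4) < 3.89 < 4.$$

For $n \ge 5$,

$$\Bigl( \xi + \sum_{\mu=1}^{q} \theta'_\mu \Bigr)^2 \le \{(n^2 - 3n + 3)^{1/2} + 1\}^2 + e^{-1}\bigl(n - \tfrac12\bigr) + \tfrac32 = g(n)^2,$$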

where 
a(n)’ << (n—S) te BN) +i=n'- G-enth—a 
<n —h+ 3" <n’, 
i.e., g(m) <n. 
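4. THEOREM 1. For $n \ge 3$, $c_n < n$.


The proof is in several steps.


4.1. Let

$$\sum_{\rho=1}^{p} \alpha_\rho = 0, \qquad |\alpha_\rho| \le 1.$$

A rearrangement

$$\delta_1 = \alpha_1,\ \delta_2 = \alpha_{\rho_2},\ \ldots,\ \delta_{p-1} = \alpha_{\rho_{p-1}},\ \delta_p = \alpha_p$$

is to be constructed such that

$$\Bigl| \sum_{\rho=1}^{q} \delta_\rho \Bigr| \le g(n) < n$$

for $1 \le q \le p$. We use induction with respect to p. For p = 1, in fact for
$p \le 2n - 1$, the result is trivial as no reordering is necessary:

$$\Bigl| \sum_{\rho=1}^{q} \alpha_\rho \Bigr| = \Bigl| \sum_{\rho=q+1}^{p} \alpha_\rho \Bigr| \le \min(q,\ p - q) \le \min(q,\ 2n - 1 - q) \le n - 1.$$

In the following it will be assumed that the result is true for $p' < p$.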
If a partial sum

$$\zeta = \alpha_1 + \sum_{i=2}^{q} \alpha_{\rho_i}, \qquad 2 \le q \le p - 2,$$

has a modulus $\le 1$, then the result may be applied to

$$\alpha_1 + \sum_{i=2}^{q} \alpha_{\rho_i} + (-\zeta) = 0 \qquad (p' = 1 + q < p),$$

and to

$$\zeta + \sum_{i=q+1}^{p-1} \alpha_{\rho_i} + \alpha_p = 0 \qquad (p' = p - q + 1 < p),$$

prescribing $\alpha_1$ and $-\zeta$ in the first case, $\zeta$ and $\alpha_p$ in the second case, as first and
last vectors of the rearrangement; combining the two arrangements and omitting
the vectors $-\zeta$ and $\zeta$, the desired rearrangement of the $\alpha_\rho$ is obtained. In the
following we may therefore make the assumptions:


(VIII) If $\zeta$ is a partial sum of the $\alpha_\rho$ containing exactly one of $\alpha_1$, $\alpha_p$ and at
least 1, at most p − 3 other vectors, then $|\zeta| > 1$.


In particular,


(VIII.1) $|\alpha_1 + \alpha_\rho| > 1$, $2 \le \rho \le p - 1$.

Also,


(VIII.2) No partial sum is 0, except possibly $\alpha_1 + \alpha_p$ and

$$\sum_{\rho=2}^{p-1} \alpha_\rho.$$

For let $\zeta$ be a partial sum other than the above, and $\zeta = 0$. The following
cases may arise: (a) $\zeta$ contains neither $\alpha_1$ nor $\alpha_p$; in this case $|\zeta + \alpha_p| \le 1$,
contradicting (VIII); (b) $\zeta$ contains one of $\alpha_1$, $\alpha_p$; this directly contradicts
(VIII) unless $\zeta = \alpha_1$ or $\zeta = \alpha_p$ or $\zeta = \alpha_1 + \alpha_2 + \ldots + \alpha_{p-1}$ or $\zeta = \alpha_2 + \alpha_3 +
\ldots + \alpha_p$, which implies $\alpha_1 = 0$ or $\alpha_p = 0$ and reduces the number of vectors
to $p' = p - 1$; (c) $\zeta$ contains both $\alpha_1$ and $\alpha_p$ and at least another $\alpha_\rho$; removal
of $\alpha_p$ gives $|\zeta - \alpha_p| \le 1$, again contradicting (VIII).


4.2. The desired rearrangement of the $\alpha_\rho$ will be obtained in three stages:

(1) a rearrangement $\beta_1, \beta_2, \ldots, \beta_p$;

(2) a trivial alteration $\gamma_1, \gamma_2, \ldots, \gamma_p$ of (1) obtained by placing $\alpha_1$ first;
here certain special partial sums

$$\sum_{\nu=1}^{q} \gamma_\nu, \quad \sum_{\nu=1}^{q'} \gamma_\nu, \quad \ldots$$

with not too distantly spaced values of q, q', ... have a modulus less than n
(more precisely, less than a bound somewhat smaller than n);

(3) the final rearrangement $\delta_1, \delta_2, \ldots, \delta_p$ obtained from (2) by reordering
the vectors within each group $\gamma_{q+1}, \ldots, \gamma_{q'}$ leading from one special partial
sum to the next.

The $\beta_\nu$, $\gamma_\nu$, $\delta_\nu$ will be defined inductively as follows. Suppose an index i,
$1 \le i \le p$, has been found such that

(i) $\beta_\nu$ have been selected from the $\alpha_\rho$ for $\nu < i$;

(ii) the non-selected vectors, $\epsilon_i, \ldots, \epsilon_p$, say, satisfy a relation

$$\sum_{\nu=i}^{p} e_\nu\epsilon_\nu = 0,$$

where

(iii) $0 < e_\nu \le 1$ for $\nu < i + n$, $e_\nu = 1$ for $\nu \ge i + n$;

(iv) $\alpha_p$ is one of the $\epsilon_\nu$; and if the $\epsilon_\nu$ other than $\alpha_p$ are p.d. then $\alpha_p = \epsilon_p$;

(v) if $\alpha_1$ is one of the $\epsilon_\nu$, then $\alpha_1 = \epsilon_i$;

(vi a) if $\alpha_1$ is one of the $\epsilon_\nu$, then $\gamma_1, \ldots, \gamma_i$ are the vectors $\alpha_1, \beta_1, \ldots, \beta_{i-1}$;
and

$$\xi = \sum_{\nu=1}^{i} \gamma_\nu = \alpha_1 + \sum_{\nu=1}^{i-1} \beta_\nu$$

is the special partial sum belonging to the index i;

(vi b) if $\alpha_1$ is one of the $\beta_\nu$, $\alpha_1 = \beta_r$ say, then $\gamma_1, \ldots, \gamma_{i-1}$ are the vectors
$\alpha_1, \beta_1, \ldots, \beta_{r-1}, \beta_{r+1}, \ldots, \beta_{i-1}$; and

$$\xi = \sum_{\nu=1}^{i-1} \gamma_\nu = \sum_{\nu=1}^{i-1} \beta_\nu$$

is the special partial sum belonging to i;

(vii) $|\xi| \le (n^2 - 3n + 3)^{1/2} + 1$;

(viii) $\delta_1, \ldots, \delta_{i-1}, (\delta_i)$ are a rearrangement of $\gamma_1, \ldots, \gamma_{i-1}, (\gamma_i)$ with $\delta_1 =
\gamma_1 = \alpha_1$;

(ix) $\Bigl| \sum_{\nu=1}^{q} \delta_\nu \Bigr| \le g(n) < n$, $q = 1, \ldots, i-1, (i)$.

Such an index i will be called a special index.


The index i = 1 is special: (i) is void as no $\beta$'s have to be selected; the given
relation

$$\sum_{\rho=1}^{p} \alpha_\rho = 0$$

plays the role of (ii) ($\alpha_\nu = \epsilon_\nu$); (iii), (iv), (v) are satisfied; defining $\delta_1 = \gamma_1 =
\alpha_1 = \xi$, (vi) and (viii) are satisfied; (vii) and (ix) are trivial.
To every special index i, with $i \le p - 2n$, a new special index j > i will
now be constructed (the construction will preserve the vectors $\beta_\nu$, $\gamma_\nu$, $\delta_\nu$ already
selected for the index i).


4.3. Relation (ii) contains p — (i — 1) > 2n+i—-— (¢-—1) = 2n+1 
terms. Applying (1) to €;, . . ., €,, we select m + 1 p.d. vectors 


Eur + + » Eun aar 
where we include 


og = Cun +i 
by (1.1), and exclude ay, by (1.2), if possible (i.e., certainly when a, = ¢, and 
€;, - « -» €&-1 are p.d.). If the relation of expressing p.d. is 
(8) oe, + Dd au, = (), a, > 0, not alla, = 0, 
j=l 


then, for all x, 


Ld n 
(9) > e & — x(aoe, + > a; &,) = (). 
j=l 


v=t 







For x = 0 all coefficients are positive and < 1; hence a positive value of x can 
be determined for which (at least) one coefficient becomes 0, the others remaining 
> 0, < 1. At most 2 coefficients can be less than 1 (those of €,, . . ., €s0—1, 
fx,» + + +» €m),» SO that at least two coefficients remain equal to 1. Renaming the 
€>: €¢, €441', ~~ +» € , taking first the vector or vectors with coefficient 0, then 
the remaining ¢, from €;, . . ., €44n—1) €s,» - « -» se» and then the remaining ones 
with coefficient 1, (9) will read 


Pp 
(10) > «,’ «’ = 0, 0<e, < lfory <<i+2n, e,’ = 1forv >i + 2n. 


ve i+] 
4.4. Put 
i+ 2n—1 
(11) ¢= \ - ey €, (0 < Oe < 1), 
v= i+1 
so that (10) may be written 
D 
(12) e+ > «/ =0. 
v= i+2n 
a, cannot be contained in the partial sum 
D 
@ ; 
v= i+2n 
for if a; occurs in (9), then a, = ¢€, by (v), ie. a; is one of €;, . . ., € 2-1; by 


(VIII.2) the partial sum cannot vanish, whence « # 0. By (II) « can be written 
in the form 


1+2n—1 


(13) «= > fib, O<f, <lforv <<it+l+n, f, =lfory>it+l +n, 
peri}! 


where 
(14) 1<l<2n-1, 
and $441, - - -» Pi+2n—1 iS a rearrangement of ¢’ 441, . . ., € s42n—1. By (II.1) it may 


be assumed that 


(15) if a is still present in (13), then a, = $44,. 


Define 
(16) jritl; 
then, by (14), 
(17) i+t1l<j<it2n —1. 
It will now be shown that j is a special index. The properties (i), . . ., (ix) relating 


to j will be denoted by (i’), . . ., (ix’). 


4.5. (i’) By (i), B, is defined for » < i; defining 
Bi=e', Bur = dun «--, Bir = Oy-1, 
8, are selected for » < j. 













The non-selected vectors are @,, . . -, Oi+2n—1 and € ¢y2n, . - ., & Which will be 
renamed ¢i42, - . -, ¢»- Substituting (13) into (12), we get 
(ii’) ¥ fede = 0, 
where se 
(iii’) 0<f,<lforv<j+n, f,=lfory>j+n; 
note also that 
(18) fy = lforv >i + 2n; 
in particular, 
(19) fai =f, = 1. 


(iv’) a, is one of the ¢, (v > j). For, either the «, other than a, are p.i.; 
then, a fortiori, the ¢, other than a, are p.i.; but (ii) expresses the p.d. of the 
¢, other than a, unless a, is present in (ii’); or the ¢, other than a, are p.d.; 
then a, = ¢, by (iv), and a, was excluded from (8), so that a, = «, = «€,' = @y. 
This latter case certainly arises if the ¢, other than a, are p.d., for this implies 
the p.d. of the «, other than a,. 

(v’) If a; is one of the ¢, (v > j), then a; = ¢,, by (15), (16). 

(vi’ a) If a is one of the ¢,, then 7y:, . . ., y, are the vectors a, 81, . . ., By-1: 
and 

j 


1 
2= Dy=a+ DB, 


vel v=) 
will be defined as the special partial sum belonging to the index j; 
(vi’ b) if a; = 8,, r <j, then yi, . . ., ys-1 are the vectors a, 8;, . . ., 8,1, 
Brat, . . -» By-1; and 


j-1 i-1 
7 = >» Y= >. B,. 


v=) v= l 


These definitions are consistent with the definitions (vi). 


4.6. We now investigate the special partial sum 7. 
In case (vi’ a) 


j-1 Pp p 
m= at 2B =a- 2 b=- 2 by (v’) 
Ll ve j y= j+1 
p p 
= —- do + > fe by (ii’) 
vm f+ v= j 


= fn x0 the, 


and asf, = 1 for y > min (j + n, i + 2n) by (iii’) and (18), 
j+k-1 


(20) n = fy — ) (1 — f,)@,, 


ver +1 
where 







jJt+k—1=min (jf +2 — 1,4 + 2n — 1) 


i.e., by (16), 


(21) k = min (m, 2n — (j — i)) = min (mn, 2n — 1), 

whence, by (14), 1 < k < n. The case k = 1 can be excluded as it would imply 
\n| = [fas] < 1 where 9 is a partial sum with j = 2n+1— 1 terms (2n < 
j <p — 2), including a, excluding a,, which contradicts (VIII). Thus, 

(22) 2<keon, 1<l<em—2, itl Sjcit Qn — 2. 

As |a:| < 1, |¢,| < 1,0 <f,<1,0<1-—f, <1, and, by (VIIIL.1) |e: + 4, 


> 1, except, possibly, for ¢, = a,, (20) satisfies the conditions of (III), and 
we have 


(23) In| < (k* — 3k + 3)' 41. 
As k < n, this implies 
(vii’) |n| < (m* — 3n + 3)4 +1. 


In case (vi’ b),

$$\eta = \sum_{\nu=1}^{j-1} \beta_\nu = -\sum_{\nu=j}^{p} \phi_\nu = -\sum_{\nu=j}^{p} \phi_\nu + \sum_{\nu=j}^{p} f_\nu \phi_\nu \qquad \text{by (ii’)}$$

$$= \sum_{\nu=j}^{p} (1 - f_\nu)(-\phi_\nu) = \sum_{\nu=j}^{j+k-1} (1 - f_\nu)(-\phi_\nu),$$

by (iii’) and (18), where $k$ is defined by (21); $k = 1$ would imply $|(1 - f_j)\phi_j| = |\eta| \le 1$, hence can be excluded as above; thus, (22) will hold and $\eta$ may be written

(24) $\eta = \sum_{\nu=j}^{j+k-2} (1 - f_\nu)(-\phi_\nu) + (1 - f_{j+k-1})(-\phi_{j+k-1}).$

We may assume that $a_p = \phi_p$ or $a_p = \phi_{j+k-1}$, so that the partial sum

$$\zeta = \sum_{\nu=j+k-1}^{p} \phi_\nu$$

contains $a_p$, but not $a_1$, and $f = p - (j + k - 1)$ further terms; $j \ge 2$, $k \ge 2$ imply $f \le p - 3$; (21) and $i < p - 2n$ imply $f \ge 1$; hence, by (VIII),

$$|\zeta| > 1.$$

Now,

(25) $\zeta = \sum_{\nu=j+k-1}^{p} \phi_\nu - \sum_{\nu=j}^{p} f_\nu \phi_\nu = \sum_{\nu=j}^{j+k-2} f_\nu(-\phi_\nu) - (1 - f_{j+k-1})(-\phi_{j+k-1}).$

(24), (25) satisfy the conditions of (IV); hence,

(26) $|\eta| \le k - (2 - \sqrt{2}),$

which implies (23) and (vii’).













4.7. It remains to establish (viii’) and (ix’). By (vi) and (vi’), the three possibilities are:

(27) $\eta = \xi + \sum_{\nu=i+1}^{j} \gamma_\nu, \qquad \eta = \xi + \sum_{\nu=i}^{j-1} \gamma_\nu, \qquad \eta = \xi + \sum_{\nu=i+1}^{j-1} \gamma_\nu.$

The $\gamma_\nu$ contained in $\xi$ have already been rearranged as $\delta_\nu$ according to (viii) to satisfy (ix); it therefore remains to reorder the $\gamma_\nu$ under the summation sign in (27). There are $m$ such $\gamma_\nu$, where $m = j - i = l$ or $m = j - i - 1 = l - 1$, i.e., $1 \le m \le l$. (The case $m = 0$ is trivial, since then $\eta = \xi$, $\beta_i = \epsilon_i = a_1$ and the vectors considered in (viii’), (ix’) are identical with those of (viii), (ix).) We distinguish two cases:

(1) $2 \le k \le n - 1$. By (21), $k = 2n - l$, $1 \le m \le 2n - k$. Together with (vii) and (23), these are the conditions of (V.2) for (27) which guarantee the required reordering (viii’) of the $\gamma_\nu$ satisfying (ix’), the bound obtained being $(n^2 - 3n + 3)^{1/2} + 1$.

(2) $k = n$. By (21), $n \le 2n - l$, whence $m \le l \le n$, and by (vii) and (vii’), (27) satisfies the conditions of (VII.1) which guarantee the required reordering (viii’) of the $\gamma_\nu$ satisfying (ix’), the bound $g(n)$ being defined as in the proof of (VII.1).

As $g(n)$ is greater than $(n^2 - 3n + 3)^{1/2} + 1$, the bound $g(n)$ may also be used in case (1).

This completes the proof that $j$ is a special index.


4.8. The procedure of selecting the $\beta_\nu$, $\gamma_\nu$, $\delta_\nu$ can be continued until a special index $i$ is reached for which $i \ge p - 2n$. In this case $\delta_1, \ldots, \delta_{i-1}$ or $\delta_1, \ldots, \delta_i$ have been correctly selected, and the corresponding special partial sum is

$$\xi = \sum_{\nu=1}^{i-1} \delta_\nu \qquad \text{or} \qquad \xi = \sum_{\nu=1}^{i} \delta_\nu.$$

If the remaining vectors are called $\gamma_i, \ldots, \gamma_p = a_p$ or $\gamma_{i+1}, \ldots, \gamma_p = a_p$ respectively, then

$$\eta = -a_p = \xi + \sum_{\nu=i}^{p-1} \gamma_\nu \qquad \text{or} \qquad \eta = -a_p = \xi + \sum_{\nu=i+1}^{p-1} \gamma_\nu$$

satisfies the conditions of (V.3), because the number of $\gamma_\nu$ is $m = p - i$ or $p - i - 1$, whence $m \le 2n \le 2n^2 - 4n + 3$ (for $n \ge 3$). Reordering the $\gamma_\nu$ according to (V.3), and choosing $a_p$ as the last vector, the rearrangement of the given vectors is completed.


5. THEOREM 2. $c_3 \le (5 + 2\sqrt{3})^{1/2} \approx 2.91$.


Proof. For any special index $i$ ($1 \le i \le p - 7$), relation (ii) of §4.2 reads ($n = 3$)

$$\sum_{\nu=i}^{i+2} e_\nu \epsilon_\nu + \sum_{\nu=i+3}^{p} \epsilon_\nu = 0, \qquad 0 \le e_\nu \le 1.$$

We shall prove (cf. (vii)) that

$$|\xi| \le 1 + \sqrt{2},$$

unless both $a_1$, $a_p$ are present in

$$\sum_{\nu=i}^{i+2} e_\nu \epsilon_\nu,$$

and the coefficient of the third vector is less than 1 (in this case (vii) gives $|\xi| \le 1 + \sqrt{3}$). If $a_1$ is not present in

$$\sum_{\nu=i}^{i+2} e_\nu \epsilon_\nu,$$

the reasoning of 4.6, (vi’ b) applies, leading to (26), which for $k \le 3$ gives the estimate $1 + \sqrt{2}$. If $a_1$ is present, $a_p$ absent, then

$$\xi = e_i \epsilon_i - (1 - e_{i+1})\epsilon_{i+1} - (1 - e_{i+2})\epsilon_{i+2};$$

$$\zeta = \epsilon_{i+3} + \ldots + \epsilon_p = -e_i \epsilon_i - e_{i+1}\epsilon_{i+1} - e_{i+2}\epsilon_{i+2},$$

and $|\zeta| > 1$, by (VIII), and $|\xi| \le 1 + \sqrt{2}$, by (IV). If, finally, $a_1$, $a_p$ are both present, but the coefficient of the third vector is 1, then

$$|\xi| = |e_i a_1 - (1 - e_{i+1}) a_p| \le 2 < 1 + \sqrt{2}.$$


The relation between the special partial sums $\xi$, $\eta$ belonging to two successive special indices $i$, $j$ is given by (27), where $m \le l \le 2n - 2 = 4$. We distinguish two cases:

(1) One of the two partial sums, say $\eta$, has modulus at most $1 + \sqrt{2}$, i.e.,

$$|\xi| \le 1 + \sqrt{3}, \qquad |\eta| \le 1 + \sqrt{2}.$$

If the $\gamma_\nu$ in (27) are called $\theta_1, \ldots, \theta_m$, then

$$\eta = \xi + \sum_{\rho=1}^{m} \theta_\rho.$$

Let $\theta_1', \ldots, \theta_m'$ be the rearrangement of $\theta_1, \ldots, \theta_m$ according to the principle used in (V). Then, for $m = 4$,











$$(\xi + \theta_1')^2 \le \tfrac{1}{2}\xi^2 + \tfrac{1}{2}|\xi| \cdot |\eta| + 1 \le \tfrac{1}{2}(\sqrt{3} + 1)^2 + \tfrac{1}{2}(\sqrt{3} + 1)(\sqrt{2} + 1) + 1 < 8.04,$$

$$|\xi + \theta_1'| < 2.84,$$

$$(\xi + \theta_1' + \theta_2')^2 \le \tfrac{1}{3} \times 8.04 + \tfrac{2}{3} \times 2.84 \times 2.42 + 1 < 8.27,$$

$$|\xi + \theta_1' + \theta_2'| < 2.88,$$

$$(\xi + \theta_1' + \theta_2' + \theta_3')^2 \le 2.88 \times 2.42 + 1 < 7.97,$$

$$|\xi + \theta_1' + \theta_2' + \theta_3'| < 2.83;$$

the maximum estimate, 2.88, is less than $(5 + 2\sqrt{3})^{1/2}$. The cases $m < 4$ are treated in the same way.

(2) The estimate $1 + \sqrt{2}$ is not available for either of $\xi$, $\eta$. This means, by (vii), that both (ii) and (ii’) contain $a_1$ and $a_p$ in their first three terms, the coefficient of the third term being less than 1. By (iv), the $\epsilon_\nu$ other than $a_p$ are p.i.; hence (8) contains $a_1 = \epsilon_i$, $a_p = \epsilon_{i+1}$, and two other vectors $\epsilon_\mu$, $\epsilon_\nu$. In the transition from (ii) via (8)-(13) to (ii’), $a_1$, $a_p$ are retained together with at least one of $\epsilon_{i+2}$, $\epsilon_\mu$, $\epsilon_\nu$, i.e., at most two vectors are eliminated. Hence, $m = l = j - i \le 2$; $m = 1$ means $\eta = \xi + \theta_1$, which requires no reordering; $m = 2$ means $\eta = \xi + \theta_1 + \theta_2$, and $\theta_1'$ can be selected from $\theta_1$, $\theta_2$ such that

$$(\xi + \theta_1')^2 \le (1 + \sqrt{3})^2 + 1 = 5 + 2\sqrt{3},$$

$$|\xi + \theta_1'| \le (5 + 2\sqrt{3})^{1/2}.$$
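The numerical chain in case (1) can be rechecked mechanically. The sketch below is not taken from the paper: it assumes that the "principle used in (V)" amounts to the averaging inequality $(\zeta + \theta')^2 \le (1 - 2/s)\zeta^2 + (2/s)|\zeta||\eta| + 1$ when $s$ vectors of length at most 1 remain, which reproduces the coefficients 1/2, 1/3 and 0 appearing above. The function names `step` and `chain` are illustrative only.

```python
import math

XI = 1 + math.sqrt(3)    # bound on |xi| given by (vii) for n = 3
ETA = 1 + math.sqrt(2)   # bound on |eta| available in case (1)

def step(zeta, eta, s):
    """Bound on |zeta + theta'| when s vectors of length <= 1 remain.

    Assumed inequality (a reading of the principle in (V), not a quotation):
    (zeta + theta')^2 <= (1 - 2/s) * zeta^2 + (2/s) * zeta * eta + 1.
    """
    return math.sqrt((1 - 2 / s) * zeta ** 2 + (2 / s) * zeta * eta + 1)

def chain(xi, eta, m):
    """Successive bounds for the partial sums xi + theta_1' + ... ."""
    bounds, zeta = [], xi
    for s in range(m, 1, -1):      # s = m, m-1, ..., 2 vectors remaining
        zeta = step(zeta, eta, s)
        bounds.append(zeta)
    return bounds

bounds = chain(XI, ETA, 4)
target = math.sqrt(5 + 2 * math.sqrt(3))   # the constant of Theorem 2

for b in bounds:
    print(round(b, 4))
print("target", round(target, 4))
```

Because no intermediate rounding is applied, each computed value lies slightly below the printed estimates 2.84, 2.88 and 2.83.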


REFERENCES 


1. E. Steinitz, Bedingt konvergente Reihen und konvexe Systeme, J. reine angew. Math. 143 (1913), 128-175; 144 (1914), 1-40.

2. W. Gross, Bedingt konvergente Reihen, Monatsh. Math. Phys. 28 (1917), 221-237.

3. V. Bergström, Ein neuer Beweis eines Satzes von E. Steinitz, Abh. Math. Seminar Hamburg, 8 (1930), 148-152.

4. ———, Zwei Sätze über ebene Vektorpolygone, Abh. Math. Seminar Hamburg, 8 (1930), 206-214.

5. I. Damsteeg and I. Halperin, The Steinitz-Gross theorem on sums of vectors, Trans. Roy. Soc. Can., sec. III, 44 (1950), 31-35.


University of Melbourne 





ON A THEOREM OF OSIMA AND NAGAO 
J. S. FRAME AND G. DE B. ROBINSON


1. Introduction. If we define the weight $b$ of a Young diagram containing $n$ nodes to be the number of removable $p$-hooks, where $n = a + bp$, then three fundamental theorems stand out in the modular representation theory of the symmetric group $S_n$.


1.1 Two irreducible representations of $S_n$ belong to the same block if and only if they have the same $p$-core.


This has been proved in various ways (1; 5; 7).


1.2 The number $l_b$ of ordinary irreducible representations in a block of weight $b$ is independent of the $p$-core and is given by

$$l_b = \sum p_{b_1} p_{b_2} \cdots p_{b_p} \qquad \Bigl(\sum_{i=1}^{p} b_i = b,\ 0 \le b_i \le b\Bigr).$$


The enumeration here is based on the 1-1 correspondence holding (5; 8) between the representations $[\alpha]$ with a given $p$-core and the associated star diagrams $[\alpha]_p^*$.


1.3 The number $l'_b$ of modular irreducible representations (indecomposables of the regular representation of $S_n$)

(i) is independent of the $p$-core, and

(ii) is given by

$$l'_b = \sum p_{b_1} p_{b_2} \cdots p_{b_{p-1}} \qquad \Bigl(\sum_{i=1}^{p-1} b_i = b,\ 0 \le b_i \le b\Bigr).$$


Theorem 1.3 (ii) was recently proven by Osima (6) assuming 1.3 (i) (8); Nagao (4) obtained 1.3 (i) and (ii) directly. We give here another version of Osima’s proof which yields, in addition, generating functions for the number of $p$-cores containing $a$ nodes and the number of blocks (1) to which the representations of $S_n$ belong.


2. Proof of 1.3(ii). The partition generating function

(2.1) $\mathcal{P}(x) = 1 + p_1 x + p_2 x^2 + p_3 x^3 + \ldots = \{(1 - x)(1 - x^2)(1 - x^3) \ldots\}^{-1}$

is well known (2, p. 272). It follows from 1.2 that

(2.2) $\mathcal{L}(x) = 1 + l_1 x + l_2 x^2 + \ldots = [\mathcal{P}(x)]^p.$


Received October 10, 1953. 
















If we write

(2.3) $\mathcal{C}(x) = 1 + c_1 x + c_2 x^2 + \ldots,$

where $c_a$ is the number of $p$-cores containing $a$ nodes, then we may enumerate the ordinary representations of $S_n$ lying in all the blocks in the following manner:

(2.4) $\mathcal{C}(x)\,\mathcal{L}(x^p) = \mathcal{C}(x)\,[\mathcal{P}(x^p)]^p = \mathcal{P}(x),$

using 2.2 and the fact that $n = a + bp$. On the other hand, assuming 1.3 (i), we may write

(2.5) $\mathcal{L}'(x) = 1 + l'_1 x + l'_2 x^2 + \ldots.$

Since the total number of modular irreducible representations is equal to the number of $p$-regular classes of $S_n$, we have

(2.6) $\mathcal{C}(x)\,\mathcal{L}'(x^p) = \mathcal{P}(x)/\mathcal{P}(x^p).$

From 2.4 and 2.6 it follows immediately that

(2.7) $\mathcal{L}'(x^p) = [\mathcal{P}(x^p)]^{p-1}$

or

(2.8) $\mathcal{L}'(x) = [\mathcal{P}(x)]^{p-1},$

which is precisely the relation 1.3 (ii).
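Relation 2.4 can be tested on small cases by solving it for the core series, $\mathcal{C}(x) = \mathcal{P}(x)/[\mathcal{P}(x^p)]^p$, and comparing the coefficients with a direct count of $p$-cores. The sketch below is only a numerical check under stated assumptions: it takes a $p$-core to be a partition none of whose hook lengths is divisible by $p$, and all function names are illustrative rather than drawn from the paper.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def is_p_core(lam, p):
    """True if no hook length of the Young diagram lam is divisible by p."""
    if not lam:
        return True
    conj = [sum(1 for part in lam if part > c) for c in range(lam[0])]
    for r, part in enumerate(lam):
        for c in range(part):
            hook = (part - c) + (conj[c] - r) - 1   # arm + leg + 1
            if hook % p == 0:
                return False
    return True

def partition_series(N):
    """Coefficients p_0, ..., p_N of P(x) in 2.1."""
    coeffs = [1] + [0] * N
    for k in range(1, N + 1):
        for n in range(k, N + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs

def series_mul(a, b, N):
    """Product of two power series truncated after x^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def series_inv(a, N):
    """Reciprocal of a power series with constant term 1."""
    inv = [1] + [0] * N
    for n in range(1, N + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def core_series(p, N):
    """Coefficients of C(x) = P(x) / [P(x^p)]^p, i.e. 2.4 solved for C."""
    P = partition_series(N)
    Pp = [P[k // p] if k % p == 0 else 0 for k in range(N + 1)]   # P(x^p)
    denom = [1] + [0] * N
    for _ in range(p):
        denom = series_mul(denom, Pp, N)
    return series_mul(P, series_inv(denom, N), N)

def core_counts(p, N):
    """Number of p-cores with a nodes, a = 0..N, by direct enumeration."""
    return [sum(1 for lam in partitions(a) if is_p_core(lam, p))
            for a in range(N + 1)]

print(core_series(3, 10))
print(core_counts(3, 10))
```

For $p = 2$ the only cores are the staircase diagrams, so the series should have coefficient 1 exactly at the triangular numbers.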


3. The number of p-regular classes. We can say a little more, however. Setting

(3.1) $\mathcal{M}(x) = 1 + m_1 x + m_2 x^2 + \ldots,$

where $m_n$ is the number of distinct blocks associated with $S_n$, we have

(3.2) $m_n = c_n + c_{n-p} + c_{n-2p} + \ldots,$

so that

(3.3) $\mathcal{M}(x) = \mathcal{C}(x)/(1 - x^p) = \mathcal{P}(x)/(1 - x^p)[\mathcal{P}(x^p)]^p,$

from 2.4.


In this connection it is worth remarking that the generating function on the right hand side of 2.6, namely,

(3.4) $\dfrac{\mathcal{P}(x)}{\mathcal{P}(x^p)} = \dfrac{(1 - x^p)(1 - x^{2p}) \ldots}{(1 - x)(1 - x^2) \ldots},$

can be interpreted in two ways. We may cancel each factor of the numerator with an equal factor of the denominator and conclude that $\mathcal{P}(x)/\mathcal{P}(x^p)$ generates the number of partitions of $n$ into summands not divisible by $p$, which is the number of $p$-regular classes. Or we may divide the $k$th factor $(1 - x^k)$ of the denominator into the $k$th factor $(1 - x^{kp})$ of the numerator, obtaining $1 + x^k + \ldots + x^{(p-1)k}$, and generate the number of partitions into summands no one of which appears as many as $p$ times. Hence we have:


3.5 The number of p-regular classes of $S_n$ is equal to the number of partitions of $n$ in which no summand appears as many as $p$ times.
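The two readings of 3.4 can be compared by brute force for small $n$. The following sketch simply enumerates partitions and counts both kinds; the helper names are illustrative and the enumeration is only practical for small $n$.

```python
from collections import Counter

def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def count_parts_not_divisible(n, p):
    """Partitions of n with no summand divisible by p."""
    return sum(1 for lam in partitions(n)
               if all(part % p != 0 for part in lam))

def count_multiplicity_below(n, p):
    """Partitions of n in which no summand appears as many as p times."""
    return sum(1 for lam in partitions(n)
               if all(mult < p for mult in Counter(lam).values()))

for n in range(8):
    print(n, count_parts_not_divisible(n, 3), count_multiplicity_below(n, 3))
```

For $p = 2$ this is the classical equality between partitions into odd parts and partitions into distinct parts.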





This result is of interest in the study of the indecomposables of the regular representation of $S_n$; such partitions may indeed characterize them.


REFERENCES 


1. R. Brauer, On a conjecture by Nakayama, Trans. Royal Soc. Canada, III, 41 (1947), 11-19.

2. G. H. Hardy and E. M. Wright, The theory of numbers (Oxford, 1945).

3. D. E. Littlewood, Modular representations of symmetric groups, Proc. Royal Soc. London (A), 209 (1951), 333-353.

4. H. Nagao, Note on the modular representations of symmetric groups, Can. J. Math., 5 (1953), 356-363.

5. T. Nakayama and M. Osima, Note on blocks of symmetric groups, Nagoya Math. J., 2 (1951), 111-117.

6. M. Osima, Some remarks on the characters of the symmetric group, Can. J. Math., 5 (1953), 336-343.

7. G. de B. Robinson, On a conjecture by Nakayama, Trans. Royal Soc. Canada, III, 41 (1947), 20-25.

8. ———, On a conjecture by J. H. Chung, Can. J. Math., 4 (1952), 373-380.





Michigan State College 











A CONSTRUCTION FOR WYTHOFFIAN POLYTOPES 
G. C. SHEPHARD 


1. Introduction. This paper contains an account of a simple method 
of deriving the coordinates of the vertices of any uniform polytope or 
honeycomb (degenerate polytope) whose symmetry group is generated by 
reflections. 

Polytopes and honeycombs of this type have been described by many authors, amongst whom must be mentioned Schläfli (10), Gosset (8), Mrs. Boole Stott (14), Schoute (12; 13), Elte (7), Robinson (9), and Coxeter (1; 2; 3; 5). The whole theory of uniform polytopes was unified by Coxeter (4; 6, pp. 86, 196), who adapted Wythoff’s construction (15) to obtain a general geometrical method for obtaining all the uniform polytopes whose symmetry groups are generated by reflections.¹ His discussion was elegantly illustrated by the use of a graphical notation (7, p. 191; 4, p. 329).

One of the most comprehensive discussions of uniform polytopes in analytical terms is that of Schoute (11; 12), whose paper, in four parts, comprises a commentary of 190 pages on Mrs. Boole Stott’s geometrical methods. As Professor Coxeter remarked to me in a letter, “it is sad to think how much unnecessary work Schoute did, through not anticipating Wythoff’s construction.”

This present paper is concerned with an analytical account of the Wythoffian 
polytopes and is based principally on the geometrical ideas of Coxeter’s paper 
(4). After the determination, for each group, of a set of basic vectors, the 
coordinates of the vertices of any uniform polytope associated with that 
group may be written down. A modified form of the same method can be 
applied to determining the coordinates of the vertices of the Wythoffian 
honeycombs. 


2. Finite groups. Suppose that $\mathfrak{G}$ is a finite $n$-dimensional group generated by reflections in $n$ primes whose point of concurrency is $O$, the origin of the (cartesian) coordinate system. Considering the reflections as operating on an $(n - 1)$-dimensional sphere whose centre is $O$, the fundamental region of the group may be taken to be a spherical simplex whose bounding figures are the intersections of the sphere with a specially chosen set of primes, reflections in which generate the group (6, pp. 188-191).


Received June 1, 1953. 


¹It is convenient to call these polytopes Wythoffian. In (4) Coxeter uses the word Wythoffian in a different sense to include some uniform polytopes whose symmetry groups are not generated by reflections, namely the “snub” polytopes in three and four dimensions.


If we represent the simplex (and therefore the reflection group) by a Coxeter graph (4, p. 329), then we may suppose that the $i$th node of the graph (the nodes being numbered in some arbitrary manner) corresponds to the prime $p_i$ which intersects the sphere in the bounding figure $P_1 P_2 \ldots P_{i-1} P_{i+1} \ldots P_n$ of the simplex $P_1 P_2 \ldots P_n$.

Now define $n$ basic vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$ in the following manner: $\mathbf{r}_i$ is in direction $OP_i$ and the distance of its end point from $p_i$ is $\tfrac{1}{2}$. Thus, considering

2.1 $\mathbf{r}_i = (r_{i1}, r_{i2}, \ldots, r_{in})$

as the coordinate vector of the point $R_i$, then the reflection $\mathbf{r}_i^*$ of $\mathbf{r}_i$ in $p_i$ is the coordinate vector of $R_i^*$ which is at unit distance from $R_i$.
The following is the basic result:
The following is the basic result: 


2.2 One vertex of any Wythoffian polytope (of unit edge length) derived from the group $\mathfrak{G}$ has the coordinate vector

$$\epsilon_1 \mathbf{r}_1 + \epsilon_2 \mathbf{r}_2 + \ldots + \epsilon_n \mathbf{r}_n,$$

where $\epsilon_i = 1$ if the $i$th node of the graph is ringed, and $\epsilon_i = 0$ if the $i$th node of the graph is not ringed.


The other vertices of the polytope may be found by applying the operations of $\mathfrak{G}$, that is, by repeated reflections in the primes $p_1, p_2, \ldots, p_n$. Consequently, when we have found the set of basic vectors for $\mathfrak{G}$, we can immediately determine the coordinates of any polytope found by applying Wythoff's construction to $\mathfrak{G}$.


2.3 Example: $\mathfrak{G} = C_3$, the symmetry group of the cube (order 48).

Graph: nodes 1, 2, 3 in a chain, the branch joining nodes 1 and 2 being marked 4.

Number of node | Basic vector | Reflecting plane
1 | $\mathbf{r}_1 = \tfrac{1}{2}(1, 1, 1)$ | $x_3 = 0$
2 | $\mathbf{r}_2 = \tfrac{1}{2}(\sqrt{2}, \sqrt{2}, 0)$ | $x_2 - x_3 = 0$
3 | $\mathbf{r}_3 = \tfrac{1}{2}(\sqrt{2}, 0, 0)$ | $x_1 - x_2 = 0$


The operations of this group correspond to permuting the coordinates in every way and also to changing the sign of any one. We write the values of the $\epsilon_i$ in the form $(\epsilon_1, \epsilon_2, \epsilon_3)$ and so derive coordinates for the seven Wythoffian derivatives of $C_3$, as in Table 2.4.
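The construction for $C_3$ is easy to reproduce numerically. The sketch below is an illustration, not part of the paper: it forms the first vertex $\epsilon_1 \mathbf{r}_1 + \epsilon_2 \mathbf{r}_2 + \epsilon_3 \mathbf{r}_3$ from the basic vectors of 2.3 (doubled, as in Table 2.4), applies all permutations and sign changes, and reports the number of distinct vertices and the shortest distance between two of them. The names `first_vertex`, `orbit` and `min_distance` are hypothetical.

```python
import itertools
import math

S2 = math.sqrt(2)
# Basic vectors r_1, r_2, r_3 of 2.3, multiplied by 2 as in Table 2.4.
BASIC = [(1.0, 1.0, 1.0), (S2, S2, 0.0), (S2, 0.0, 0.0)]

def first_vertex(eps):
    """Coordinate vector eps_1 r_1 + eps_2 r_2 + eps_3 r_3 (doubled)."""
    return tuple(sum(e * r[k] for e, r in zip(eps, BASIC)) for k in range(3))

def orbit(v):
    """Images of v under all permutations and sign changes of coordinates."""
    points = set()
    for perm in itertools.permutations(v):
        for signs in itertools.product((1, -1), repeat=3):
            # round to merge floating-point duplicates; + 0.0 turns -0.0 into 0.0
            points.add(tuple(round(s * x, 9) + 0.0 for s, x in zip(signs, perm)))
    return points

def min_distance(points):
    """Shortest distance between two distinct points of the set."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))

for eps in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1),
            (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
    pts = orbit(first_vertex(eps))
    print(eps, len(pts), round(min_distance(pts), 6))
```

Since the coordinates are doubled, the shortest distance should be 2 in every case, in agreement with the unit edge length asserted in 2.2.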

The proof of the basic result follows immediately from Coxeter’s account of Wythoff’s construction. Evidently the “first vertex” is left invariant by a reflection if the corresponding $\epsilon_i$ is zero, or is transformed into a point at unit distance if the corresponding $\epsilon_i$ is unity.

2.4 Table of polyhedra associated with $C_3$.
(All the coordinates in this table have been multiplied by 2, and so relate to polyhedra of edge length 2.)

$(\epsilon_1, \epsilon_2, \epsilon_3)$ | First vertex | Number of vertices | Coordinates of vertices′ | Name
(1, 0, 0) | $(1, 1, 1)$ | 8 | $(\pm 1, \pm 1, \pm 1)$ | Cube
(0, 1, 0) | $(\sqrt{2}, \sqrt{2}, 0)$ | 12 | $(\pm\sqrt{2}, \pm\sqrt{2}, 0)$ | Cuboctahedron
(0, 0, 1) | $(\sqrt{2}, 0, 0)$ | 6 | $(\pm\sqrt{2}, 0, 0)$ | Octahedron
(0, 1, 1) | $(2\sqrt{2}, \sqrt{2}, 0)$ | 24 | $(\pm 2\sqrt{2}, \pm\sqrt{2}, 0)$ | Truncated octahedron
(1, 0, 1) | $(1 + \sqrt{2}, 1, 1)$ | 24 | $(\pm(1 + \sqrt{2}), \pm 1, \pm 1)$ | Rhombicuboctahedron
(1, 1, 0) | $(1 + \sqrt{2}, 1 + \sqrt{2}, 1)$ | 24 | $(\pm(1 + \sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ | Truncated cube
(1, 1, 1) | $(1 + 2\sqrt{2}, 1 + \sqrt{2}, 1)$ | 48 | $(\pm(1 + 2\sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ | Truncated cuboctahedron

The sign (′) implies that the coordinates are to be permuted in every possible way.

By allowing $\epsilon_i$ to take other values, the vertices of polytopes may be derived whose bounding figures are parallel to the corresponding bounding figures of a uniform polytope but whose edges are equal in length to the values of the non-zero $\epsilon_i$. (See for example (4, p. 336). Here a ringed node marked $\sqrt{2}$ is taken to indicate that $\epsilon_i$ is to be given the value $\sqrt{2}$.)

The basic vectors, and their reflections in the primes, are precisely the translations to be effected on the bounding figures in the “expansions” and “contractions” of Mrs. Stott’s method (14).


3. Infinite groups. A similar method may be used for the coordinates of the vertices of a degenerate polytope (honeycomb) in $n$ dimensions. In this case the fundamental region consists of a Euclidean simplex $P_1 P_2 \ldots P_{n+1}$ of which the vertex $P_{n+1}$ is chosen as the origin $O$ of the coordinate system. The method differs from that for finite groups on account of the fact that if all the Wythoffian polytopes associated with a given group have edges of unit length, the “scale” of the group (i.e., the size of the fundamental simplex) may be different in each case. We proceed as follows:


The primes $p_1, p_2, \ldots, p_n$ are defined to be the faces of the fundamental simplex that pass through $O$, and $p_{n+1}(c)$ to be the prime

$$\sum_{i=1}^{n} a_i x_i = c$$

parallel to the face $P_1 P_2 \ldots P_n$ of the simplex, and normalized so that $\sum a_i^2 = 1$. The size of the simplex is therefore altered by varying the value of the constant $c$. The vectors $\mathbf{r}_i$ are defined as before, that is, $\mathbf{r}_i$ lies along the line of intersection of all the primes except $p_i$ and $p_{n+1}$ and its end point is at distance $\tfrac{1}{2}$ from $p_i$. Also $\mathbf{r}_{n+1}$ is the zero vector. Define also the $n + 1$ constants $c_i$ by the relations

$$\sum_{j=1}^{n} a_j r_{ij} = c_i \quad (i = 1, 2, \ldots, n), \qquad \tfrac{1}{2} = c_{n+1},$$

where $\mathbf{r}_i$ is taken in the form 2.1.
The first vertex of the honeycomb has coordinate vector

$$\epsilon_1 \mathbf{r}_1 + \epsilon_2 \mathbf{r}_2 + \ldots + \epsilon_{n+1} \mathbf{r}_{n+1},$$

where $\epsilon_i = 1$ if the $i$th node of the graph is ringed, $\epsilon_i = 0$ if the $i$th node of the graph is not ringed, and the other vertices are given by repeated reflections of the first vertex in $p_1, p_2, \ldots, p_n$ and $p_{n+1}(c)$, where

$$c = \epsilon_1 c_1 + \epsilon_2 c_2 + \ldots + \epsilon_{n+1} c_{n+1}.$$











































3.2 Table of honeycombs associated with $R_4$.
(All the coordinates in this table have been multiplied by 2, and so relate to honeycombs of edge length 2.)

$(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4)$ | First vertex | Coordinates of vertices | Polyhedra in honeycomb
(1, 0, 0, 0) | $(1, 1, 1)$ | $(\pm 1, \pm 1, \pm 1)$ (mod 2) | Cubes
(0, 1, 0, 0) | $(\sqrt{2}, \sqrt{2}, 0)$ | $(\pm\sqrt{2}, \pm\sqrt{2}, 0)$ (mod $2\sqrt{2}$) | Octahedra and cuboctahedra
(1, 1, 0, 0) | $(1 + \sqrt{2}, 1 + \sqrt{2}, 1)$ | $(\pm(1 + \sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ (mod $2(1 + \sqrt{2})$) | Truncated cubes and octahedra
(1, 0, 1, 0) | $(1 + \sqrt{2}, 1, 1)$ | $(\pm(1 + \sqrt{2}), \pm 1, \pm 1)$ (mod $2(1 + \sqrt{2})$) | Rhombicuboctahedra, cubes and cuboctahedra
(1, 0, 0, 1) | $(1, 1, 1)$ | $(\pm 1, \pm 1, \pm 1)$ (mod 4) | Cubes
(0, 1, 1, 0) | $(2\sqrt{2}, \sqrt{2}, 0)$ | $(\pm 2\sqrt{2}, \pm\sqrt{2}, 0)$ (mod $4\sqrt{2}$) | Truncated octahedra
(1, 1, 1, 0) | $(1 + 2\sqrt{2}, 1 + \sqrt{2}, 1)$ | $(\pm(1 + 2\sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ (mod $2(1 + 2\sqrt{2})$) | Truncated cuboctahedra, cubes and truncated octahedra
(1, 1, 0, 1) | $(1 + \sqrt{2}, 1 + \sqrt{2}, 1)$ | $(\pm(1 + \sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ (mod $2(2 + \sqrt{2})$) | Truncated cubes, cubes, octagonal prisms and rhombicuboctahedra
(1, 1, 1, 1) | $(1 + 2\sqrt{2}, 1 + \sqrt{2}, 1)$ | $(\pm(1 + 2\sqrt{2}), \pm(1 + \sqrt{2}), \pm 1)$ (mod $4(1 + \sqrt{2})$) | Truncated cuboctahedra and octagonal prisms


3.1 Example: $\mathfrak{G} = R_4$, the symmetry group of uniformly packed cubes.

Graph: nodes 1, 2, 3, 4 in a chain, the branches joining nodes 1, 2 and nodes 3, 4 being marked 4.

Number of node | Basic vector | $c_i$ | Reflecting plane
1 | $\mathbf{r}_1 = \tfrac{1}{2}(1, 1, 1)$ | $\tfrac{1}{2}$ | $x_3 = 0$
2 | $\mathbf{r}_2 = \tfrac{1}{2}(\sqrt{2}, \sqrt{2}, 0)$ | $\tfrac{1}{2}\sqrt{2}$ | $x_2 - x_3 = 0$
3 | $\mathbf{r}_3 = \tfrac{1}{2}(\sqrt{2}, 0, 0)$ | $\tfrac{1}{2}\sqrt{2}$ | $x_1 - x_2 = 0$
4 | $\mathbf{r}_4 = (0, 0, 0)$ | $\tfrac{1}{2}$ | $x_1 = c$


Reflections in $p_1$, $p_2$ and $p_3$ are equivalent to permuting the coordinates in every way, and altering the sign of any coordinate. Reflections in all four planes include the operation of increasing any coordinate by a multiple of $2c$. Hence the coordinates of the vertices may be written

$$(x_1, x_2, x_3) \pmod{2c},$$

though this may not be the simplest or most elegant form. These points evidently form a number of lattices.
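The moduli listed in Table 3.2 follow from the constants $c_i$ of 3.1 by simple arithmetic. The sketch below assumes the relation $c = \epsilon_1 c_1 + \ldots + \epsilon_4 c_4$ of §3 and the doubling of coordinates used in the table, so that the tabulated modulus is $2 \cdot 2c = 4c$; the function names are illustrative.

```python
import math

S2 = math.sqrt(2)
# Constants c_1, ..., c_4 read from the table in 3.1.
C = (0.5, S2 / 2, S2 / 2, 0.5)

def scale_constant(eps):
    """c = eps_1 c_1 + eps_2 c_2 + eps_3 c_3 + eps_4 c_4."""
    return sum(e * c for e, c in zip(eps, C))

def doubled_modulus(eps):
    """Modulus 2c of the vertex coordinates, doubled as in Table 3.2."""
    return 4 * scale_constant(eps)

for eps in [(1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0),
            (1, 0, 0, 1), (0, 1, 1, 0), (1, 1, 1, 0), (1, 1, 0, 1),
            (1, 1, 1, 1)]:
    print(eps, round(doubled_modulus(eps), 6))
```

For instance, the ringing (1, 0, 0, 1) gives $c = 1$, hence cubes of doubled edge 2 repeated with period 4.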

Owing to the fact that the graph is symmetrical, only nine of the fifteen 
Wythoffian derivatives (Table 3.2) are distinct (5, pp. 402-403).* 


*On page 403, the symbols for hd, and he,954 have been accidentally transposed. 


REFERENCES 


1. H. S. M. Coxeter, The pure Archimedean polytopes in six and seven dimensions, Proc. Cambridge Phil. Soc., 24 (1928), 1-9.

2. ———, Polytopes with regular-prismatic vertex figures, Phil. Trans. Royal Soc. (A), 229 (1930), 329-425.

3. ———, Polytopes with regular-prismatic vertex figures, II, Proc. London Math. Soc., 34 (1932), 126-189.

4. ———, Wythoff’s construction for uniform polytopes, Proc. London Math. Soc. (2), 38 (1935), 327-339.

5. ———, Regular and semiregular polytopes, Math. Z., 46 (1940), 380-407.

6. ———, Regular polytopes (London, 1948; New York, 1949).

7. E. L. Elte, The semiregular polytopes of the hyperspaces (Groningen, 1912).

8. T. Gosset, On the regular and semiregular figures in space of n dimensions, Messenger of Math., 29 (1900), 43-48.

9. G. de B. Robinson, On the fundamental region of a group and the family of configurations which arise therefrom, J. London Math. Soc., 6 (1931), 70-75.

10. L. Schläfli, Réduction d’une intégrale multiple qui comprend l’arc du cercle et l’aire du triangle sphérique comme cas particuliers, J. de Math., 20 (1855), 359-394; Ges. Math. Abh., 2 (Basel, 1953), 164-190.

11. P. H. Schoute, Analytical treatment of the polytopes regularly derived from the regular polytopes I, Ver. der K. Akad. van Wet. te Amsterdam (1), 11.3 (1911), 1-83.

12. ———, Analytical treatment of the polytopes regularly derived from the regular polytopes II, III, IV, Ver. der K. Akad. van Wet. te Amsterdam (1), 11.5 (1913), 1-108.

13. ———, The characteristic numbers of the prismotope, Proc. Royal Acad. Sci., Amsterdam, 14 (1911), 424-428.

14. A. Boole Stott, Geometrical deduction of the semiregular from regular polytopes and space fillings, Ver. der K. Akad. van Wet. te Amsterdam (1), 11.1 (1910).

15. W. A. Wythoff, A relation between the polytopes of the $C_{600}$-family, K. Akad. van Wet. te Amsterdam, Proc. of the Section of Sciences, 20 (1918), 966-970.


University of Chicago 







ON THE SYMMETRIES OF SPHERICAL HARMONICS 
BURNETT MEYER 


INTRODUCTION 


Let $\mathfrak{G}$ be a finite group of transformations of three-dimensional Euclidean space, such that the distance between any two points is preserved by all transformations of the group. Such a group is a group of orthogonal linear transformations of three variables, or, geometrically speaking, a group of rotations and rotatory inversions. Thirty-two groups of this type are important in crystallography and are known as the crystallographic classes.

A function is said to have the symmetry of a given group if it remains invariant under all transformations of the group. Our problem is to determine all spherical harmonics of a given degree $m$ and a given symmetry. It is sufficient to find a basis of these harmonics for all $m$ and for all groups $\mathfrak{G}$.

Section I of this paper enumerates and classifies all groups of the desired 
type. In §II we find the number of elements in a basis of all homogeneous 
polynomials of a given degree which have a given symmetry, applying a theorem 
of Molien. 

In §III we find the number of elements in a basis of all spherical harmonics 
of a given degree which have a given symmetry. This is accomplished by 
associating with each group a generating function. 

In §IV we solve the problem proposed, using the results of §III. The required 
basis is found in terms of partial derivatives of 1/r, r denoting the distance from 
the origin. For certain simpler symmetries the basis is also expressed in terms 
of the associated Legendre functions. 

A particular case of this problem arose and was solved in another research 
problem, the aim of which was to compute approximately the electrostatic 
capacity of the cube (12, pp. 76-78). In generalizing this particular case, we 
were led to our results which were announced, without proof, in two notes 
(10; 11). 

Work in this problem has been done previously by Poole (13), Laporte (7), 
Bethe (1), Ehlert (4), and Hodgkinson (6), and recently by Stiefel (15). 
For the geometrical and algebraic background see Molien (9) and the biblio- 
graphies in Coxeter (3a) and Speiser (14). 

The present paper differs in two respects from preceding work on the subject. 
First, all groups are treated in a uniform manner, whereas previous papers 
are restricted to certain groups. Second, the generating function of §III enables 


Received May 4, 1953. Sponsored in part by the Office of Naval Research. Portions of this paper are a condensation of a dissertation, prepared under the direction of Professor George Pólya and submitted to Stanford University in June, 1949 in partial fulfillment of the requirements for the degree of Doctor of Philosophy.


us to discuss fully the question of linear independence and to link the subject to 
general theorems of group theory and analysis. 

The major portion of this paper consists of material contained in the doctoral dissertation of the author. He wishes to thank Professor George Pólya for suggesting the problem and for his helpful guidance. In addition, some results are presented which were obtained recently by Professor Pólya and by the author.


I. Groups 


1. Rotations and rotatory inversions. The purpose of this section is to 
enumerate and classify all finite groups of distance-preserving transformations 
of three-dimensional Euclidean space into itself which leave one point fixed. 
The elements of such groups are either rotations or rotatory inversions. A rotatory 
inversion is a rotation followed by central symmetry with respect to a point 
on the axis of rotation. 

Without loss of generality the origin may be chosen as the fixed point. Our 
problem may then be restated in algebraic form: We seek all finite groups of 
orthogonal linear transformations in x, y, and z. 

The matrix of a rotatory inversion may be written $JR$, in which

$$J = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

and $R$ is a rotation. We sometimes use the term “rotatory inversion with angle $\theta$.” This means a rotation through angle $\theta$, followed by central symmetry with respect to the origin.


2. Three types of finite groups. For proofs of statements in this section, 
see Weyl (17, pp. 77-80, 149-156). 

(a). Type 1: Groups consisting of rotations only. We first consider groups, all the elements of which are rotations. It has been proved by Felix Klein that there are only five classes of groups of this type. They are $\mathfrak{C}_n$, $\mathfrak{D}_n$, $\mathfrak{T}$, $\mathfrak{O}$, and $\mathfrak{I}$, the cyclic, dihedral, tetrahedral, octahedral, and icosahedral groups, respectively. In the following paragraphs we will indicate how the rotational axes of each group are to be placed with respect to the $x$, $y$, and $z$ axes.

For the group $\mathfrak{C}_n$, the $n$-fold axis will be taken as the $z$-axis.

For $\mathfrak{D}_n$, the $n$-fold axis will also be the $z$-axis and one of the 2-fold axes will be the $x$-axis.

All rotations of the group $\mathfrak{O}$ transform a cube (or a regular octahedron) into itself. In this paper the cube will be placed with its centre at the origin and with its faces parallel to the coordinate planes.

The tetrahedral group, $\mathfrak{T}$, consists of the rotations which transform a regular tetrahedron into itself. The tetrahedron will be placed so that its 3-fold axes coincide with the 3-fold axes of $\mathfrak{O}$, and the 2-fold axes will be taken as the coordinate axes.


The icosahedral group, $\mathfrak{I}$, consists of the rotations which transform a regular icosahedron (or a regular dodecahedron) into itself. We shall place the icosahedron, as explained in Coxeter (3, pp. 52-53), with its centre at the origin, and the coordinate axes passing through the midpoints of opposite edges in such a way that the edges through which the $x$-axis passes are parallel to the $y$-axis.

(b). Groups containing rotatory inversions. There are two types of groups 
containing rotatory inversions. 

The groups of Type 2 are those with a centre of symmetry and are obtained by adjoining $J$ to a group of Type 1. If $g$ is the order of a rotational group $\mathfrak{G}$, then $2g$ is the order of $\mathfrak{G}_i$, the group of Type 2 derived from $\mathfrak{G}$. The groups of Type 2 are $\mathfrak{C}_{ni}$, $\mathfrak{D}_{ni}$, $\mathfrak{T}_i$, $\mathfrak{O}_i$, and $\mathfrak{I}_i$.

The groups of Type 3 are derived from a rotational group $\mathfrak{G}_2$, which has a subgroup, $\mathfrak{G}_1$, of index 2. We denote by $\mathfrak{G}_1[\mathfrak{G}_2$ the group consisting of the rotations of $\mathfrak{G}_1$ and all elements of the form $JR$, $R$ being a rotation in $\mathfrak{G}_2$ but not in $\mathfrak{G}_1$.
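A small instance makes the Type 3 construction concrete. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the rotational group of order 4 generated by a quarter turn about the $z$-axis, its subgroup of order 2, and forms the rotations of the subgroup together with the products $JR$ for the remaining rotations. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 3x3 integer matrices stored as tuples of tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def det(a):
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

E = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
J = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
R90 = ((0, -1, 0), (1, 0, 0), (0, 0, 1))   # quarter turn about the z-axis

def cyclic_group(gen):
    """All powers of gen, up to and including the identity."""
    elems, cur = set(), E
    while True:
        cur = matmul(cur, gen)
        elems.add(cur)
        if cur == E:
            return elems

def is_group(elems):
    """Closure under multiplication (enough for a finite set of matrices)."""
    return all(matmul(a, b) in elems for a in elems for b in elems)

G2 = cyclic_group(R90)                   # rotations through multiples of 90 degrees
G1 = cyclic_group(matmul(R90, R90))      # subgroup of index 2
mixed = G1 | {matmul(J, r) for r in G2 - G1}

print(len(G2), len(G1), len(mixed), is_group(mixed), J in mixed)
```

The resulting set has the same order as the rotational group it is derived from, contains rotatory inversions, and does not contain $J$ itself, which distinguishes it from the groups of Type 2.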


3. Crystallographic classes. All finite groups of distance-preserving 
transformations have been enumerated in the preceding section. Certain of 
these groups are important in crystallography and are known as the crystallo- 
graphic classes. The transformations in such groups must not only be distance- 
preserving but they must also transform a point lattice into itself. It can be 
shown (17, pp. 98-104) that of the groups previously discussed only those 
having all their axes of rotation or rotatory inversion of orders 2, 3, 4, and 6 
are crystallographic classes. There are thirty-two such groups. 


II. INVARIANT POLYNOMIALS


1. The generating function of Molien. Let $\mathfrak{G}$ be one of the finite groups of orthogonal linear transformations in $x$, $y$, and $z$ discussed in §I. Given a non-negative integer $m$, we consider those homogeneous polynomials of degree $m$ which have the symmetry of $\mathfrak{G}$; that is, they are invariant with respect to all transformations of $\mathfrak{G}$. We define an invariant basis of degree $m$ for $\mathfrak{G}$ as a finite subset of these polynomials, the elements of which are linearly independent but on which all other invariant polynomials of degree $m$ are linearly dependent.

The number of polynomials constituting such a basis depends on $m$ and $\mathfrak{G}$, but not on the particular subset chosen; we call this number $g_m$. The purpose of this section is to determine $g_m$ for arbitrary $m$ and for all the groups discussed in §I.


Molien (9) solved this problem by finding a generating function

$$g(t) = \sum_{m=0}^{\infty} g_m t^m.$$

Since his derivation can be simplified somewhat, a proof will be given in the following section.











2. Molien’s theorem. In the following proof due to Burnside (2, p. 300) we use the terminology and notation of MacDuffee (8, pp. 17-19).

Let $\mathfrak{G}$ be an abstract group of order $n$, and let $\Gamma$ be a representation of it by linear transformations of $k$ variables; such a representation is said to be of degree $k$. Let the matrices of $\Gamma$ be $A_1, A_2, \ldots, A_n$.

We denote by $f_i(t)$ the polynomial $\det(E - tA_i)$, which we shall call the characteristic polynomial of $A_i$. Then

$$f_i(t) = \det(E - tA_i) = \prod_{j=1}^{k} (1 - \lambda_j t) = 1 - \chi(A_i)\,t + \ldots,$$

in which $\chi(A_i)$ is the trace and the $\lambda_j$ are the characteristic roots of $A_i$.

Now consider $P_m(A_i)$, the $m$th power-matrix (8, pp. 84-87) of $A_i$. The characteristic roots of $P_m(A_i)$ are all the possible products of the $m$th degree of powers of $\lambda_1, \lambda_2, \ldots, \lambda_k$. The trace of $P_m(A_i)$ is, then, the sum of all possible products of the $m$th degree of powers of the $\lambda$’s; that is, it is the coefficient of $t^m$ in the expansion of

$$\frac{1}{(1 - \lambda_1 t)(1 - \lambda_2 t) \ldots (1 - \lambda_k t)} = \frac{1}{f_i(t)}$$

in powers of $t$.

The matrices $P_m(A_1), P_m(A_2), \ldots, P_m(A_n)$ form a group $\Gamma_m$ which is isomorphic with $\Gamma$; that is, it also represents $\mathfrak{G}$. Our problem is now to find the number of independent invariant linear forms of $\Gamma_m$, since these are an invariant basis of $\Gamma$ of degree $m$. But the number of independent invariant linear forms of a group of matrices is equal to the sum of the traces of the matrices, divided by the order of the group (14, pp. 158-161). Therefore, $g_m$ is the coefficient of $t^m$ in the expansion of

$$\frac{1}{n} \sum_{i=1}^{n} \frac{1}{f_i(t)}$$

in powers of $t$. But since this proof is valid for all $m$ we have proved


MOLIEN’S THEOREM: Let $g_m$ be the number of elements of an invariant basis of degree $m$ for a group $\mathfrak{G}$ of order $n$ of orthogonal matrices. Let $f_i(t)$ $(i = 1, 2, \ldots, n)$ be the characteristic polynomials of the matrices of $\mathfrak{G}$. Then

$$g(t) = \sum_{m=0}^{\infty} g_m t^m = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{f_i(t)}.$$

3. Characteristic polynomials. We have seen that the generating function
$g(t)$ corresponding to the group $\mathfrak{G}$ is the arithmetic mean of reciprocals of the
characteristic polynomials of the matrices which are the elements of $\mathfrak{G}$.

Let $A$ be the matrix of a rotation or a rotatory inversion of angle $\theta$ of $\mathfrak{G}$
with a given axis. We may perform the transformation $A$ in the following
manner: First, a rotation $S$ may be performed that brings the axis of rotation
or of rotatory inversion of $A$ to coincidence with the $z$-axis; second, a
transformation $A'$ may be performed, $A'$ being a rotation or rotatory inversion with
angle $\theta$ about the $z$-axis; then the rotation $S^{-1}$ is performed. Hence, $A = S^{-1}A'S$.
Thus, $A$ and $A'$ are equivalent matrices and have the same characteristic
polynomial (14, pp. 147-148). We may, then, in computing the contribution
of any orthogonal matrix to the generating function, always take the axis of
rotation or rotatory inversion as the $z$-axis.

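The characteristic polynomial for rotation through an angle $\theta$ is

$$\det\left[\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - t \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\right] = (1 - t)(1 - 2t\cos\theta + t^2).$$

For rotatory inversion with angle $\theta$, the characteristic polynomial is

$$\det\left[\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - t \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\right] = (1 + t)(1 + 2t\cos\theta + t^2).$$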
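The two characteristic polynomials can be spot-checked numerically. The sketch below evaluates $\det(E - tA)$ directly for a rotation about the $z$-axis and for the corresponding rotatory inversion (minus the rotation matrix), and compares with the factored forms; the helper names and the sample values of $\theta$ and $t$ are arbitrary choices made for illustration.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def rotation_z(theta):
    """Matrix of a rotation through theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def char_poly(a, t):
    """Value of det(E - tA) for a 3x3 matrix A."""
    return det3([[(1.0 if row == col else 0.0) - t * a[row][col]
                  for col in range(3)] for row in range(3)])

def max_deviation(samples):
    """Largest gap between det(E - tA) and the factored forms."""
    worst = 0.0
    for theta, t in samples:
        rot = rotation_z(theta)
        inv = [[-x for x in row] for row in rot]  # rotatory inversion
        c = math.cos(theta)
        worst = max(
            worst,
            abs(char_poly(rot, t) - (1 - t) * (1 - 2 * t * c + t * t)),
            abs(char_poly(inv, t) - (1 + t) * (1 + 2 * t * c + t * t)),
        )
    return worst

samples = [(0.3, 0.5), (2.0, -1.7), (math.pi / 3, 0.9)]
print(max_deviation(samples))
```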


If it is necessary to distinguish between the generating functions of several
groups $\mathfrak{G}_1, \mathfrak{G}_2, \ldots$, the notations $g(t; \mathfrak{G}_1), g(t; \mathfrak{G}_2), \ldots$ will be used.


4. Groups of Type 1. By computing the generating functions for a few
small values of $n$, it is conjectured that the generating function for $\mathfrak{C}_n$ is

$$g(t; \mathfrak{C}_n) = \frac{1 + t^n}{(1 - t)(1 - t^2)(1 - t^n)}.$$

This may be verified by factoring the denominator of the above fraction into
linear factors involving the $n$th roots of unity, expanding the fraction into a
sum of partial fractions, and recombining these in pairs to obtain

$$\frac{1}{n} \sum_{\nu=0}^{n-1} \frac{1}{(1 - t)\{1 - 2t\cos(2\pi\nu/n) + t^2\}}.$$

The above expansion is clearly $g(t; \mathfrak{C}_n)$. The algebraic calculations in the
foregoing partial fraction expansion are elementary but tedious, as the cases of $n$
even and $n$ odd must be considered separately in the intermediate stages.

The generating functions of all groups of Type 1 can be expressed as sums
and differences of those of $\mathfrak{C}_n$ for various values of $n$. Suppose that $\mathfrak{G}$ is of order
$n$ and has $p_1$ different $q_1$-fold axes, $p_2$ different $q_2$-fold axes, etc. Then,

$$g(t; \mathfrak{G}) = (1/n)\,[\,p_1 q_1\, g(t; \mathfrak{C}_{q_1}) + p_2 q_2\, g(t; \mathfrak{C}_{q_2}) + \ldots - (p_1 + p_2 + \ldots - 1)\, g(t; \mathfrak{C}_1)\,];$$

observe that the group $\mathfrak{C}_1$ contains only the identity element.
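The axis formula can be tried out with exact integer power series. In the sketch below the axis counts (four 3-fold and three 2-fold axes for the tetrahedral rotation group of order 12; three 4-fold, four 3-fold and six 2-fold axes for the octahedral rotation group of order 24) are standard facts assumed here rather than taken from this section, and the closed forms used for comparison are read off from the operating-polynomial series given later in the paper. All function names are hypothetical.

```python
def series(numerator_extra, denominators, terms):
    """Coefficients of (1 + t^e) / prod(1 - t^k); pass e=None for numerator 1."""
    s = [0] * terms
    s[0] = 1
    if numerator_extra is not None and numerator_extra < terms:
        s[numerator_extra] += 1
    for k in denominators:
        # Dividing by (1 - t^k) is the running sum s[m] += s[m - k].
        for m in range(k, terms):
            s[m] += s[m - k]
    return s

def g_cyclic(q, terms):
    """Series of g(t; C_q) = (1 + t^q) / ((1 - t)(1 - t^2)(1 - t^q))."""
    return series(q, (1, 2, q), terms)

def g_from_axes(order, axes, terms):
    """axes is a list of (p, q): p different q-fold axes."""
    total = [0] * terms
    for p, q in axes:
        total = [a + p * q * b for a, b in zip(total, g_cyclic(q, terms))]
    extra = sum(p for p, _ in axes) - 1  # surplus copies of the identity
    total = [a - extra * b for a, b in zip(total, g_cyclic(1, terms))]
    assert all(a % order == 0 for a in total)
    return [a // order for a in total]

def harmonic_counts(g):
    """Coefficients of (1 - t^2) g(t)."""
    return [g[m] - (g[m - 2] if m >= 2 else 0) for m in range(len(g))]

TERMS = 24
h_T = harmonic_counts(g_from_axes(12, [(4, 3), (3, 2)], TERMS))
h_O = harmonic_counts(g_from_axes(24, [(3, 4), (4, 3), (6, 2)], TERMS))
print(h_T[:13])
print(h_O[:13])
```

Multiplying by $1 - t^2$ at the end anticipates the relation $h(t) = (1 - t^2)g(t)$ proved in §III, so that the output can be compared with $(1+t^6)/((1-t^3)(1-t^4))$ and $(1+t^9)/((1-t^4)(1-t^6))$.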


5. Groups of Type 2. It will be recalled that the group $\mathfrak{G}_i$ has a subgroup
$\mathfrak{G}$ which consists entirely of rotations. For every rotation of angle $\theta$ of $\mathfrak{G}$,
the group $\mathfrak{G}_i$ has a rotatory inversion of angle $\theta$. But the contribution of a
rotatory inversion of angle $\theta$ to the generating function is obtained by changing
$t$ to $-t$ in the function corresponding to a rotation of angle $\theta$. Hence, if $n$ is the
order of the group $\mathfrak{G}$,

$$g(t; \mathfrak{G}_i) = (1/2n)\,[\,n\,g(t; \mathfrak{G}) + n\,g(-t; \mathfrak{G})\,] = \tfrac{1}{2}\,[\,g(t; \mathfrak{G}) + g(-t; \mathfrak{G})\,],$$

that is, the even part of $g(t; \mathfrak{G})$.


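6. Groups of Type 3. It will be recalled that the group $\mathfrak{G}_1[\mathfrak{G}_2$ has the
following structure: $\mathfrak{G}_2$ is a rotational group and $\mathfrak{G}_1$ a subgroup of index 2.
$\mathfrak{G}_1[\mathfrak{G}_2$ consists of the rotations of $\mathfrak{G}_1$ plus a rotatory inversion with angle $\theta$
corresponding to each rotation of angle $\theta$ belonging to $\mathfrak{G}_2$ but not to $\mathfrak{G}_1$.
Therefore, if $\mathfrak{G}_1$ is of order $n$,

$$g(t; \mathfrak{G}_1[\mathfrak{G}_2) = (1/2n)\,[\,n\,g(t; \mathfrak{G}_1) + 2n\,g(-t; \mathfrak{G}_2) - n\,g(-t; \mathfrak{G}_1)\,]$$

$$= \tfrac{1}{2}\,[\,g(t; \mathfrak{G}_1) - g(-t; \mathfrak{G}_1)\,] + g(-t; \mathfrak{G}_2).$$

The first term is the odd part of $g(t; \mathfrak{G}_1)$.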
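The rules for Types 2 and 3 act on the coefficients of a power series in a simple way, since replacing $t$ by $-t$ only changes the sign of the odd coefficients. The following sketch applies both rules to the tetrahedral and octahedral rotation groups. The Molien series used as input, $(1+t^6)/((1-t^2)(1-t^3)(1-t^4))$ and $(1+t^9)/((1-t^2)(1-t^4)(1-t^6))$, are assumed here as known closed forms, and the function names are illustrative only.

```python
def series(numerator_extra, denominators, terms):
    """Coefficients of (1 + t^e) / prod(1 - t^k); pass e=None for numerator 1."""
    s = [0] * terms
    s[0] = 1
    if numerator_extra is not None and numerator_extra < terms:
        s[numerator_extra] += 1
    for k in denominators:
        for m in range(k, terms):
            s[m] += s[m - k]
    return s

def type2(g):
    """Even part of g(t): generating function of the Type 2 group."""
    return [a if m % 2 == 0 else 0 for m, a in enumerate(g)]

def type3(g1, g2):
    """Odd part of g(t; G1) plus g(-t; G2), for G1 of index 2 in G2."""
    out = []
    for m, (a, b) in enumerate(zip(g1, g2)):
        odd_part = a if m % 2 == 1 else 0
        flipped = b if m % 2 == 0 else -b
        out.append(odd_part + flipped)
    return out

def harmonic_counts(g):
    """Coefficients of (1 - t^2) g(t)."""
    return [g[m] - (g[m - 2] if m >= 2 else 0) for m in range(len(g))]

TERMS = 30
g_T = series(6, (2, 3, 4), TERMS)   # assumed Molien series, tetrahedral rotations
g_O = series(9, (2, 4, 6), TERMS)   # assumed Molien series, octahedral rotations

h_Ti = harmonic_counts(type2(g_T))
h_TO = harmonic_counts(type3(g_T, g_O))
print(h_Ti[:13])
print(h_TO[:13])
```

The two results agree with $(1+t^6)/((1-t^4)(1-t^6))$ and $1/((1-t^3)(1-t^4))$, the forms that correspond to the operating-polynomial series listed for these two groups in Table IV.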


III. THE GENERATING FUNCTION
1. Invariant harmonic basis. Let $\mathfrak{G}$ be one of the finite groups discussed
in §I. We define an invariant harmonic basis of degree $m$ for $\mathfrak{G}$ as a set of linearly
independent spherical harmonics of degree $m$ which are invariants of $\mathfrak{G}$ and
on which all other invariant spherical harmonics of degree $m$ are linearly
dependent. Let $h_m$ be the number of elements in an invariant harmonic basis
of degree $m$ for $\mathfrak{G}$. We wish to determine $h_m$ for arbitrary $m$.


2. The main theorem. The main theorem of this section obtains a generating
function for the $h_m$; this generating function is surprisingly similar in form
to the generating function of Molien, as is seen in the

THEOREM. Let $\mathfrak{G}$ be a finite group of orthogonal linear transformations in
$x$, $y$, and $z$. Let

$$h(t) = \sum_{m=0}^{\infty} h_m t^m,$$

in which $h_m$ is the number of elements in an invariant harmonic basis for $\mathfrak{G}$ of
degree $m$. Then

$$h(t) = (1 - t^2)\,g(t),$$

in which $g(t)$ is the generating function of Molien.
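In terms of coefficients the theorem says $h_m = g_m - g_{m-2}$, or equivalently $g_m = h_m + h_{m-2} + h_{m-4} + \ldots$. A small sketch for the cyclic groups, using the closed form of $g(t; \mathfrak{C}_n)$ from §II and hypothetical function names, shows that the resulting counts are $2[m/n] + 1$, the value used later in §IV.

```python
def cyclic_molien(n, terms):
    """Coefficients g_m of (1 + t^n) / ((1 - t)(1 - t^2)(1 - t^n))."""
    s = [0] * terms
    s[0] += 1
    if n < terms:
        s[n] += 1
    for k in (1, 2, n):
        # Dividing by (1 - t^k) is the running sum s[m] += s[m - k].
        for m in range(k, terms):
            s[m] += s[m - k]
    return s

def harmonic_counts(g):
    """h_m = g_m - g_{m-2}, the coefficients of (1 - t^2) g(t)."""
    return [g[m] - (g[m - 2] if m >= 2 else 0) for m in range(len(g))]

def molien_from_harmonic(h):
    """g_m = h_m + h_{m-2} + h_{m-4} + ..., the theorem read backwards."""
    return [sum(h[m - 2 * j] for j in range(m // 2 + 1)) for m in range(len(h))]

g = cyclic_molien(3, 12)
h = harmonic_counts(g)
print(h)
```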


3. Operations preserving invariance. Before proceeding to the proof of
the main theorem, we shall need some lemmas.

LEMMA 1. Let $P(x, y, z)$ and $Q(x, y, z)$ be homogeneous polynomials which are
invariants of $\mathfrak{G}$. Then $P + Q$ and $PQ$ are also homogeneous invariant polynomials.

LEMMA 2. Let $P(x, y, z)$ and $Q(x, y, z)$ be homogeneous polynomial invariants
of a group $\mathfrak{G}$ of orthogonal linear transformations. Then the polynomial

(1)   $R(x, y, z) = P\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) Q(x, y, z)$

is also a homogeneous polynomial invariant of $\mathfrak{G}$.

Proof. Let $A$ represent the matrix of any orthogonal transformation of $\mathfrak{G}$.
By the chain rule of differentiation, it is easily seen that the operators

$$\frac{\partial}{\partial x}, \quad \frac{\partial}{\partial y}, \quad \frac{\partial}{\partial z}$$

are contragredient with $x, y, z$ (16, pp. 149-154). But since $A$ is an orthogonal
matrix, the operators are cogredient with the variables. Hence, the right member
of (1) is an invariant of $A$, and, therefore, of every transformation in $\mathfrak{G}$.

Of course, the polynomial $R(x, y, z)$ in the above lemma may turn out to be
identically zero.

Since $x^2 + y^2 + z^2$ is an invariant of all orthogonal matrices, the above
lemmas have the following special cases, which will be of importance in the
following sections:

LEMMA 3. If $P(x, y, z)$ is a homogeneous polynomial invariant of $\mathfrak{G}$, of degree
$m$, then $(x^2 + y^2 + z^2)^j P(x, y, z)$ is also a homogeneous polynomial invariant of
$\mathfrak{G}$ of degree $m + 2j$; we let $j$ denote a non-negative integer.

LEMMA 4. If $P(x, y, z)$ is a homogeneous polynomial invariant of $\mathfrak{G}$ of degree
$m$, then $\Delta P(x, y, z)$ is also a homogeneous polynomial invariant of $\mathfrak{G}$ of degree
$m - 2$; we let $\Delta$ denote the Laplace operator,

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$
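Lemma 4 is easy to experiment with if a polynomial is stored as a dictionary from exponent triples to coefficients. The representation and the function name below are choices made for this sketch only; it applies the Laplace operator termwise and shows three small cases.

```python
def laplacian(poly):
    """Apply the Laplace operator to a polynomial in x, y, z.

    poly maps exponent triples (i, j, k) to the coefficient of x^i y^j z^k.
    """
    out = {}
    for exps, coef in poly.items():
        for axis in range(3):
            e = exps[axis]
            if e >= 2:
                lowered = list(exps)
                lowered[axis] = e - 2
                key = tuple(lowered)
                out[key] = out.get(key, 0) + coef * e * (e - 1)
    # Drop terms that cancelled, so a harmonic polynomial maps to {}.
    return {k: v for k, v in out.items() if v != 0}

r_squared = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
quartic = {(4, 0, 0): 1, (0, 4, 0): 1, (0, 0, 4): 1}
harmonic = {(2, 0, 0): 1, (0, 2, 0): -1}          # x^2 - y^2

print(laplacian(r_squared))   # the constant 6
print(laplacian(quartic))     # 12 (x^2 + y^2 + z^2)
print(laplacian(harmonic))    # empty: x^2 - y^2 is harmonic
```

The second case shows the degree dropping by two while invariance under permutations and sign changes of the coordinates is kept, as the lemma asserts for any orthogonal group.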

4. Proof of the first part of the main theorem. We shall prove the main
theorem in the form

$$g_m = h_m + h_{m-2} + h_{m-4} + \ldots.$$

In this first part we shall prove that

$$g_m \leq h_m + h_{m-2} + h_{m-4} + \ldots.$$

For any degree $m$, an invariant basis has $g_m$ elements and an invariant harmonic
basis has $h_m$ elements. For brevity, let $h_m = \alpha$ and $g_m = \alpha + \beta$. We shall choose
first an invariant harmonic basis of degree $m$, say $H_1, H_2, \ldots, H_\alpha$. Then $\beta$
homogeneous polynomials of degree $m$, say $I_1, I_2, \ldots, I_\beta$, are chosen so that
the complete set

$$H_1, H_2, \ldots, H_\alpha, I_1, I_2, \ldots, I_\beta$$

forms an invariant basis.

Let $P(x, y, z)$ be an invariant homogeneous polynomial of $\mathfrak{G}$ of degree $m$. Then

$$P = a_1 H_1 + a_2 H_2 + \ldots + a_\alpha H_\alpha + b_1 I_1 + b_2 I_2 + \ldots + b_\beta I_\beta.$$

Furthermore, $P$ is harmonic if and only if $b_1 = b_2 = \ldots = b_\beta = 0$. That is,

$$\Delta P = b_1 \Delta I_1 + b_2 \Delta I_2 + \ldots + b_\beta \Delta I_\beta = 0$$

only if $b_1 = b_2 = \ldots = b_\beta = 0$. But this means that $\Delta I_1, \Delta I_2, \ldots, \Delta I_\beta$ are
linearly independent homogeneous polynomials of degree $m - 2$ and are
invariants of $\mathfrak{G}$ by Lemma 4 of the last section. Hence, $\beta = g_m - h_m \leq g_{m-2}$, or
$g_m \leq h_m + g_{m-2}$. By repeated application of this inequality to $g_{m-2}, g_{m-4}, \ldots$,
we obtain

$$g_m \leq h_m + h_{m-2} + h_{m-4} + \ldots,$$

and the first part of the main theorem is proved.


5. Proof of the second part of the main theorem. In this section we shall
show that

(2)   $g_m \geq h_m + h_{m-2} + h_{m-4} + \ldots.$

This result, combined with that of the last section, will prove the main theorem.
The inequality (2) will be proved if we can construct

$$h_m + h_{m-2} + h_{m-4} + \ldots$$

invariant homogeneous polynomials of degree $m$ which are linearly independent.
For this construction, we take first $h_m$ independent invariant harmonics of
degree $m$, then $h_{m-2}$ invariant harmonics of degree $m - 2$ each multiplied by
$x^2 + y^2 + z^2$, then $h_{m-4}$ invariant harmonics of degree $m - 4$ each multiplied by
$(x^2 + y^2 + z^2)^2$, and so on. Thus we obtain $h_m + h_{m-2} + h_{m-4} + \ldots$ homogeneous
polynomials of degree $m$, all invariant by Lemma 3. It remains to show that
they are linearly independent.

Let $\mathfrak{H}_{m-\mu}$ be a linear combination of the $h_{m-\mu}$ harmonics selected above,
containing $h_{m-\mu}$ arbitrary constants, $\mu = 0, 2, 4, \ldots$. We wish to show that if

(3)   $\mathfrak{H}_m + (x^2 + y^2 + z^2)\,\mathfrak{H}_{m-2} + (x^2 + y^2 + z^2)^2\,\mathfrak{H}_{m-4} + \ldots = 0$

identically, all the $h_m + h_{m-2} + h_{m-4} + \ldots$ constants are necessarily zero.

To this end we multiply equation (3) by $\mathfrak{H}_{m-\nu}$, in which $\nu$ takes one of the
values $0, 2, 4, \ldots$, and integrate the resulting equation over the surface of the
unit sphere. Because of the orthogonality of surface harmonics of different
degrees on the unit sphere,

$$\iint \mathfrak{H}_{m-\nu}\,(\mathfrak{H}_m + \mathfrak{H}_{m-2} + \mathfrak{H}_{m-4} + \ldots)\,d\omega = \iint \mathfrak{H}_{m-\nu}^2\,d\omega = 0.$$

Therefore, $\mathfrak{H}_{m-\nu}$ vanishes on the surface of the unit sphere. It follows, from
the uniqueness theorem for harmonic functions, that $\mathfrak{H}_{m-\nu}$ vanishes identically.
But if $\mathfrak{H}_{m-\nu}$ vanishes identically, then the $h_{m-\nu}$ constants it contains must all
be zero. Now let $\nu$ assume, in turn, the values $0, 2, 4, \ldots$. Thus, all the
$h_m + h_{m-2} + h_{m-4} + \ldots$ constants involved vanish, and the second part of the main
theorem is proved.


6. Tables of generating functions. The factor $1 - t^2$ appears in the
denominators of all the generating functions $g(t)$ found in §II. Since $h(t) = (1 - t^2)g(t)$,
we find $h(t)$ at once. Table I gives the generating functions $h(t)$ for all finite
groups.

TABLE I
GENERATING FUNCTIONS OF THE FINITE GROUPS

(The absence of planes of symmetry is indicated by $-$ or 0, according as the
group is purely rotational or skew.)

[Columns: group for $n$ even; group for $n$ odd; $h(t)$; number $p$ of planes of symmetry. The table entries are not legible in this copy.]


7. Further properties of $h(t)$. Besides the basic property of $h(t)$ just
discussed, this function shows, in various ways, the structure of the abstract
group and the geometrical properties of the symmetry. The proofs of the
following theorems are omitted for brevity; the theorems may be verified by
consulting Table I.

THEOREM 1.

$$\lim_{t \to 1} (1 - t)^2\, h(t) = \frac{2}{n},$$

$n$ being the order of the group.

THEOREM 2.

$$\lim_{t \to \infty} t\, h(t) = 1 \text{ or } 0$$

according as $\mathfrak{G}$ is or is not of Type 1.

THEOREM 3. The function $h(t)$ is an even function if and only if $\mathfrak{G}$ is of Type 2.

THEOREM 4. Let $\mathfrak{G}_1 \subset \mathfrak{G}_2$ and let $h_1(t)$ and $h_2(t)$ be their respective generating
functions. Then

$$h_1(t) \geq h_2(t).$$


Besides the classification of the finite groups given in §I, the groups of Types
2 and 3 may also be classified according to the number of planes of symmetry.
The groups generated by planes of symmetry are

$$\mathfrak{C}_1[\mathfrak{C}_2, \quad \mathfrak{C}_n[\mathfrak{D}_n, \quad \mathfrak{D}_{ni}\ (n \text{ even}), \quad \mathfrak{D}_n[\mathfrak{D}_{2n}\ (n \text{ odd}), \quad \mathfrak{T}[\mathfrak{O}, \quad \mathfrak{O}_i, \quad \mathfrak{I}_i.$$

They have, respectively, $1, n, n + 1, n + 1, 6, 9$, and 15 planes of symmetry
which divide the space into $2, 2n, 4n, 4n, 24, 48$, and 120 compartments, so
that the number of compartments equals the order of the group.

Groups of Types 2 and 3 which have no plane of symmetry are called skew
groups. They are $\mathfrak{C}_n[\mathfrak{C}_{2n}$ ($n$ even) and $\mathfrak{C}_{ni}$ ($n$ odd). Let $p$ denote the number of
planes of symmetry in the group.


THEOREM 5.

$$\lim_{t \to \infty} t^{p+1}\, h(t) = 1$$

if, and only if, $\mathfrak{G}$ is not a skew group.

THEOREM 6. The function $h(t)$ vanishes for a finite value of $t$, not a root of
unity, if and only if $\mathfrak{G}$ is skew.

THEOREM 7. The function $h(t)$ vanishes at no finite point if and only if $\mathfrak{G}$
is generated by reflections.

THEOREM 8. Let $\mathfrak{G}$ be a group generated by reflections, $\mathfrak{G}_1$ the purely rotational
subgroup of $\mathfrak{G}$ of index 2, $h(t)$ and $h_1(t)$ their respective generating functions. Then

$$h_1(t) = (1 + t^p)\, h(t).$$


IV. INVARIANT HARMONIC BASES

1. Invariant basis and invariant harmonic basis. Let $\mathfrak{G}$ denote one of
the groups discussed in §I, and let $\mathfrak{H}_m$ denote the set of spherical harmonics
of degree $m$ which form an invariant harmonic basis of degree $m$ for $\mathfrak{G}$. Then,
by the proof of the second part of the main theorem of §III, the set

$$[\,\mathfrak{H}_m,\ (x^2 + y^2 + z^2)\,\mathfrak{H}_{m-2},\ (x^2 + y^2 + z^2)^2\,\mathfrak{H}_{m-4},\ \ldots\,]$$

is an invariant basis of degree $m$ for $\mathfrak{G}$. In this manner, the problem of finding
an invariant basis of degree $m$ is reduced to that of finding an invariant harmonic
basis of each of the degrees $m, m - 2, m - 4, \ldots$.

The remainder of this paper will be devoted to finding invariant harmonic
bases of arbitrary degree for each of the finite groups of §I. We shall construct
the bases using the Maxwell representation of spherical harmonics in terms of
partial derivatives of $1/r$. For the cyclic and dihedral groups and groups of
Types 2 and 3 derived from them we shall develop an equivalent representation
of the bases, which is simpler in certain respects, in terms of the associated
Legendre functions.


2. Invariant harmonic bases in the Maxwell representation.

Operating basis. Let $Q(x, y, z)$ be a homogeneous polynomial of degree $m$ in
$x, y, z$. Then

$$Q\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$$

is a differential operator formed by replacing the variables of the polynomial by
the appropriate partial derivators. Let $r = (x^2 + y^2 + z^2)^{1/2}$. The expression

(1)   $r^{2m+1}\, Q\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \frac{1}{r}$

is a spherical harmonic of degree $m$ or is identically zero. Furthermore, the
above expression is identically zero if and only if $Q(x, y, z)$ is divisible by
$x^2 + y^2 + z^2$ (5, pp. 127-129). The polynomial $Q(x, y, z)$ will be called the operating
polynomial corresponding to the spherical harmonic (1). If $Q$ is an invariant
of a group $\mathfrak{G}$ of orthogonal linear transformations, then the spherical harmonic
(1) will be also an invariant of $\mathfrak{G}$ by Lemma 2, §III.
These considerations lead to the following basic


THEOREM. Let

$$Q_1(x, y, z),\ Q_2(x, y, z),\ \ldots,\ Q_{h_m}(x, y, z)$$

be $h_m$ homogeneous polynomials of degree $m$ which are invariants of $\mathfrak{G}$ and which
are linearly independent mod $(x^2 + y^2 + z^2)$. Then the $h_m$ spherical harmonics

(2)   $r^{2m+1}\, Q_\mu\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \frac{1}{r} \qquad (\mu = 1, 2, \ldots, h_m)$

form an invariant harmonic basis of degree $m$ for $\mathfrak{G}$.

The statement that the $Q_\mu$ are linearly independent mod $(x^2 + y^2 + z^2)$
means that a linear combination of them,

$$\sum_\mu A_\mu Q_\mu,$$

is divisible by $x^2 + y^2 + z^2$ only if $A_\mu = 0$ for all $\mu$. We call $Q_1, Q_2, \ldots, Q_{h_m}$ an
operating basis of degree $m$ for $\mathfrak{G}$.


3. Invariant harmonic bases for $\mathfrak{C}_n$, $\mathfrak{D}_n$, and derived groups. A result
of §III,

$$h(t; \mathfrak{C}_n) = \frac{1 + t^n}{(1 - t)(1 - t^n)} = (1 - t)^{-1}(1 + t^n)(1 + t^n + t^{2n} + \ldots) = (1 - t)^{-1}(1 + 2t^n + 2t^{2n} + \ldots),$$

suggests that we seek an invariant operating polynomial for $\mathfrak{C}_n$ of degree one,
and two invariant operating polynomials of each of the degrees $n, 2n, 3n, \ldots$.
Let $\rho = (x^2 + y^2)^{1/2}$. Since the rotation generating $\mathfrak{C}_n$, expressed in spherical
coordinates, is $\phi \to \phi + 2\pi/n$, the polynomials

$$z,\ \rho^n \cos n\phi,\ \rho^n \sin n\phi,\ \rho^{2n} \cos 2n\phi,\ \rho^{2n} \sin 2n\phi,\ \ldots$$

are clearly invariants of $\mathfrak{C}_n$. For simplicity of notation, let

$$C_n = \rho^n \cos n\phi = x^n - \binom{n}{2} x^{n-2} y^2 + \binom{n}{4} x^{n-4} y^4 - \ldots,$$

$$C'_n = \rho^n \sin n\phi = \binom{n}{1} x^{n-1} y - \binom{n}{3} x^{n-3} y^3 + \binom{n}{5} x^{n-5} y^5 - \ldots.$$

We shall call $z$, $C_n$, and $C'_n$ the fundamental operating polynomials for $\mathfrak{C}_n$.
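The binomial expansions of $C_n$ and $C'_n$ are the real and imaginary parts of $(x + iy)^n$, which also makes their invariance under the rotation $\phi \to \phi + 2\pi/n$ easy to test numerically. The sketch below does both; the function names and the sample point are arbitrary, and `math.comb` requires Python 3.8 or later.

```python
import math

def c_poly(n, x, y):
    """C_n = rho^n cos(n phi), from the binomial expansion with even powers of y."""
    return sum((-1) ** k * math.comb(n, 2 * k) * x ** (n - 2 * k) * y ** (2 * k)
               for k in range(n // 2 + 1))

def c_prime_poly(n, x, y):
    """C'_n = rho^n sin(n phi), from the binomial expansion with odd powers of y."""
    return sum((-1) ** k * math.comb(n, 2 * k + 1)
               * x ** (n - 2 * k - 1) * y ** (2 * k + 1)
               for k in range((n - 1) // 2 + 1))

def rotate(x, y, angle):
    """Rotate the point (x, y) through the given angle about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y

n, x, y = 5, 0.7, -1.2
w = complex(x, y) ** n
xr, yr = rotate(x, y, 2 * math.pi / n)

print(c_poly(n, x, y), w.real)         # agree up to rounding
print(c_prime_poly(n, x, y), w.imag)   # agree up to rounding
print(c_poly(n, xr, yr), c_prime_poly(n, xr, yr))  # unchanged by the rotation
```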
We consider the expression

(3)   $(1 - z)^{-1}\,(1 + C_n + C'_n + C_{2n} + C'_{2n} + \ldots) = (1 + z + z^2 + \ldots)(1 + C_n + C'_n + C_{2n} + C'_{2n} + \ldots),$

or rather the double series resulting from term-by-term multiplication of the
two infinite series in the last line, without any reference to convergence, as the
set of its terms. Each term is an invariant operating polynomial for $\mathfrak{C}_n$. That is,
our assertion is that an operating basis consists of the following $2[m/n] + 1$
polynomials:

(4)   $z^m,\ z^{m-n} C_n,\ z^{m-n} C'_n,\ z^{m-2n} C_{2n},\ z^{m-2n} C'_{2n},\ \ldots.$

By the theorem of the last section, all that remains to be shown is that the
polynomials (4) are linearly independent mod $(x^2 + y^2 + z^2)$. In this section
and the next two, we shall find series analogous to (3) for all finite groups. Such a
series constitutes a complete solution of our problem.

It remains to show that the polynomials of the set (4) are linearly independent
mod $(x^2 + y^2 + z^2)$; that is, we wish to show

(5)   $A_0 z^m + \sum_{\nu=1}^{[m/n]} z^{m-\nu n}\,(A_\nu C_{\nu n} + B_\nu C'_{\nu n}) \equiv 0 \mod (x^2 + y^2 + z^2)$

only if $A_0 = A_1 = A_2 = \ldots = B_1 = B_2 = \ldots = 0$.
Without loss of generality, we may assume that $m$ is even, since if $m$ is odd,
we may multiply the congruence by $z$, obtaining an equivalent congruence.
We first consider the case of $n$ even. Then the left member of (5) is even in $z$.
The congruence (5) is then equivalent to the equation

(6)   $A_0 (-x^2 - y^2)^{m/2} + \sum_{\nu=1}^{[m/n]} (-x^2 - y^2)^{(m-\nu n)/2}\,(A_\nu C_{\nu n} + B_\nu C'_{\nu n}) = 0$

identically. Now if we let $x = \cos\phi$, $y = \sin\phi$, (6) becomes

(7)   $(-1)^{m/2} A_0 + \sum_{\nu=1}^{[m/n]} (-1)^{(m-\nu n)/2}\,(A_\nu \cos \nu n\phi + B_\nu \sin \nu n\phi) = 0.$

If equation (7) is multiplied, in turn, by

(8)   $1,\ \cos\phi,\ \sin\phi,\ \cos 2\phi,\ \sin 2\phi,\ \ldots$

and the resulting equation is integrated from $-\pi$ to $\pi$, we see, by the
orthogonality of the set (8), that all the coefficients $A_\nu$ and $B_\nu$ must be zero, which was
the aim of our proof.

We now consider the case of $n$ odd. Then, equation (5) has the form

(9)   $F_1(x, y, z^2) + z F_2(x, y, z^2) \equiv 0 \mod (x^2 + y^2 + z^2).$

The above congruence remains true if we substitute $-z$ for $z$,

(10)   $F_1(x, y, z^2) - z F_2(x, y, z^2) \equiv 0 \mod (x^2 + y^2 + z^2).$

If we now add and subtract (9) and (10), we obtain

(11)   $F_1(x, y, z^2) \equiv 0, \quad F_2(x, y, z^2) \equiv 0 \mod (x^2 + y^2 + z^2).$

The proof may now be completed in a similar manner to the case of $n$ even,
since both congruences of (11) are of the form (5) (with changed values for
the parameters $m$ and $n$).

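We pass to the group $\mathfrak{D}_n$. We assume that one of the 2-fold axes, perpendicular
to the $n$-fold axis, is the $x$-axis, as explained in §I. Then we obtain the group
$\mathfrak{D}_n$ by adjoining to $\mathfrak{C}_n$ the matrix

$$K = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

or $\phi \to -\phi$ and $\theta \to \pi - \theta$; that is, the group consists of the matrices of $\mathfrak{C}_n$ and
the coset $K\mathfrak{C}_n$. Hence, an invariant of $\mathfrak{C}_n$ which is also an invariant of $K$ will
be an invariant of $\mathfrak{D}_n$. Upon applying the transformation $K$ to the fundamental
operating polynomials of $\mathfrak{C}_n$, we note that

$$z \to -z, \quad C_{\omega n} \to C_{\omega n}, \quad C'_{\omega n} \to -C'_{\omega n} \qquad (\omega = 1, 2, 3, \ldots).$$

Hence, $z^2$, $C_{\omega n}$, and $z C'_{\omega n}$ are invariants of $\mathfrak{D}_n$.
In §III the generating function for $\mathfrak{D}_n$ was found to be

$$h(t; \mathfrak{D}_n) = \frac{1 + t^{n+1}}{(1 - t^2)(1 - t^n)} = (1 - t^2)^{-1}(1 + t^{n+1})(1 + t^n + t^{2n} + \ldots) = (1 - t^2)^{-1}(1 + t^n + t^{n+1} + t^{2n} + t^{2n+1} + \ldots).$$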














We consider now the expression

(12)   $(1 - z^2)^{-1}\,(1 + C_n + z C'_n + C_{2n} + z C'_{2n} + \ldots)$

in the same way as we did (3), namely as the set of all the terms of the double
series resulting from the term-by-term multiplication of the two factors. It
is clear, by comparison of the above expression with the generating function,
that we have the correct number of operating polynomials for each degree $m$.
We have just seen that each term in the series is an invariant of $\mathfrak{D}_n$, and, since
the above operating polynomials are a subset of those found for $\mathfrak{C}_n$, their
linear independence mod $(x^2 + y^2 + z^2)$ has already been proved.

Similarly, the invariant operating polynomials for the other groups derived
from $\mathfrak{C}_n$ will be a subset of those for $\mathfrak{C}_n$ and may be found in a completely
analogous manner. In all cases the symmetry elements of these groups will be
placed, with relation to the coordinate axes, as explained in §I, 2. In Table II,
the matrix which generates each group from an appropriate subgroup (either
$\mathfrak{C}_n$ or $\mathfrak{D}_n$) is given. The transformation in spherical coordinates corresponding
to this matrix is also given. Table III shows which of the invariant operating
polynomials for $\mathfrak{C}_n$ are also invariants for each of the groups. In this table, $+$
means that the expression is an invariant; $-$ means that the expression is not
an invariant. In most cases (all except $\mathfrak{D}_{ni}$ and $\mathfrak{D}_n[\mathfrak{D}_{2n}$) these expressions
which are not invariants go over into their negatives upon application of the
transformation adjoined to the subgroup. Finally, the series analogous to (3)
and (12) are given in Table IV for each of the groups. For convenience, Tables
II and IV also include the groups of the Cubic System which will be discussed
in the next section.

Thus we see that these three tables, together with Table I, present in tabular
form, for each of the groups, a complete derivation of the invariant operating
polynomials similar to that carried out in detail for $\mathfrak{D}_n$.


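4. Invariant harmonic basis for groups of the Cubic System. In §III
we found that

$$h(t; \mathfrak{T}) = \frac{1 + t^6}{(1 - t^3)(1 - t^4)}.$$

With this in mind we seek an invariant of $\mathfrak{T}$ for each of the degrees three, four,
and six. Geometrical considerations enable us to find these invariants.

It is clear that in any purely rotational group $\mathfrak{G}$, the set of all axes of rotation
must be permuted by any transformation of the group; that is, the permutation
group of these axes forms a representation of $\mathfrak{G}$. Moreover, it may be that the
set of all rotational axes of the group may be decomposed into several disjoint
subsets in such a manner that all rotations of the group permute the axes of each
of these subsets among themselves. Suppose there are $q$ axes in one of these
subsets, with direction numbers $\alpha_i, \beta_i, \gamma_i$ $(i = 1, 2, \ldots, q)$. Then the expression

$$\prod_{i=1}^{q} (\alpha_i x + \beta_i y + \gamma_i z)$$

is transformed into a constant multiple of itself by any rotation of $\mathfrak{G}$. Proper
choice of the direction numbers can easily be made in order that the above
expression be an invariant.

TABLE II
MATRICES ADJOINED TO SUBGROUPS TO GENERATE GROUPS

Group ($n$ even) | Subgroup | Generating matrix | Transformation in spherical coordinates | Group ($n$ odd)
$\mathfrak{D}_n$ | $\mathfrak{C}_n$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ | $\phi \to -\phi$, $\theta \to \pi - \theta$ | $\mathfrak{D}_n$
$\mathfrak{C}_n[\mathfrak{D}_n$ | $\mathfrak{C}_n$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ | $\phi \to -\phi$ | $\mathfrak{C}_n[\mathfrak{D}_n$
$\mathfrak{C}_{ni}$ | $\mathfrak{C}_n$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ | $\theta \to \pi - \theta$ | $\mathfrak{C}_n[\mathfrak{C}_{2n}$
$\mathfrak{D}_{ni}$ | $\mathfrak{D}_n$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ | $\theta \to \pi - \theta$ | $\mathfrak{D}_n[\mathfrak{D}_{2n}$
$\mathfrak{C}_n[\mathfrak{C}_{2n}$ | $\mathfrak{C}_n$ | $\begin{pmatrix} \cos\frac{\pi}{n} & -\sin\frac{\pi}{n} & 0 \\ \sin\frac{\pi}{n} & \cos\frac{\pi}{n} & 0 \\ 0 & 0 & -1 \end{pmatrix}$ | $\phi \to \phi + \frac{\pi}{n}$, $\theta \to \pi - \theta$ | $\mathfrak{C}_{ni}$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ | $\mathfrak{D}_n$ | $\begin{pmatrix} \cos\frac{\pi}{n} & -\sin\frac{\pi}{n} & 0 \\ \sin\frac{\pi}{n} & \cos\frac{\pi}{n} & 0 \\ 0 & 0 & -1 \end{pmatrix}$ | $\phi \to \phi + \frac{\pi}{n}$, $\theta \to \pi - \theta$ | $\mathfrak{D}_{ni}$
$\mathfrak{T}$ | $\mathfrak{D}_2$ | $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$ | | $\mathfrak{T}$
$\mathfrak{O}$ | $\mathfrak{T}$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}$ | | $\mathfrak{O}$
$\mathfrak{T}[\mathfrak{O}$ | $\mathfrak{T}$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$ | | $\mathfrak{T}[\mathfrak{O}$
$\mathfrak{T}_i$ | $\mathfrak{T}$ | $-E$ | | $\mathfrak{T}_i$
$\mathfrak{O}_i$ | $\mathfrak{O}$ | $-E$ | | $\mathfrak{O}_i$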











TABLE III
INVARIANCE OF OPERATING POLYNOMIALS FOR $\mathfrak{C}_n$ WITH RESPECT TO DERIVED GROUPS

($\mu = 1, 3, 5, \ldots$; $\nu = 2, 4, 6, \ldots$)

[Rows: the derived groups, labelled separately for $n$ even and $n$ odd. Columns: operating polynomials for $\mathfrak{C}_n$. Entries: $+$ or $-$. The table entries are not legible in this copy.]


The regular tetrahedron is placed in the position described in §I, inscribed in
a cube. Then it is clear that the 13 axes of rotation separate into three disjoint
subsets: the three axes through the centroids of the faces of the cube, the four
diagonals of the cube, and the six axes through the midpoints of the edges of
the cube. Corresponding to these three sets of axes are the polynomials

$$O_3 = xyz,$$

$$O_4 = (x + y + z)(-x + y + z)(x - y + z)(x + y - z),$$

$$O_6 = (x^2 - y^2)(y^2 - z^2)(z^2 - x^2).$$

Since $O_4 \equiv -2(x^4 + y^4 + z^4)$, mod $(x^2 + y^2 + z^2)$, in the following we shall
use instead

$$O_4 = x^4 + y^4 + z^4.$$
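The congruence for $O_4$ rests on the identity $(x+y+z)(-x+y+z)(x-y+z)(x+y-z) + 2(x^4+y^4+z^4) = (x^2+y^2+z^2)^2$, which can be confirmed by expanding both sides. The sketch below does this with polynomials stored as dictionaries from exponent triples to integer coefficients; the representation and helper names are choices made for this illustration.

```python
from functools import reduce

def pmul(a, b):
    """Product of two polynomials stored as {(i, j, k): coefficient}."""
    out = {}
    for (i1, j1, k1), c1 in a.items():
        for (i2, j2, k2), c2 in b.items():
            key = (i1 + i2, j1 + j2, k1 + k2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}

def padd(a, b, scale=1):
    """Return a + scale * b."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + scale * c
    return {k: v for k, v in out.items() if v != 0}

def linear(cx, cy, cz):
    """The linear form cx*x + cy*y + cz*z."""
    return {(1, 0, 0): cx, (0, 1, 0): cy, (0, 0, 1): cz}

o4_product = reduce(pmul, [linear(1, 1, 1), linear(-1, 1, 1),
                           linear(1, -1, 1), linear(1, 1, -1)])
quartic = {(4, 0, 0): 1, (0, 4, 0): 1, (0, 0, 4): 1}
r_squared = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}

lhs = padd(o4_product, quartic, scale=2)
rhs = pmul(r_squared, r_squared)
print(lhs == rhs)
```

Since the right-hand side is divisible by $x^2 + y^2 + z^2$, the product of the four linear forms and $-2(x^4 + y^4 + z^4)$ give the same spherical harmonic when used as operating polynomials.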
It can be easily shown that $O_3$, $O_4$, and $O_6$ are invariants of $\mathfrak{T}$. It now seems
reasonable to conjecture that the set of invariant operating polynomials for
$\mathfrak{T}$ is represented by the expression

$$(1 - O_3)^{-1}\,(1 - O_4)^{-1}\,(1 + O_6),$$

in the sense illustrated in the foregoing section by the discussions of (3) and
(12). To prove the above conjecture it remains to show that all terms of a given
degree $m$ are linearly independent mod $(x^2 + y^2 + z^2)$. That is, we wish to
show that

(13)   $\sum_i a_i\, O_3^{\alpha_i} O_4^{\beta_i} + O_6 \sum_j c_j\, O_3^{\gamma_j} O_4^{\delta_j} \equiv 0 \mod (x^2 + y^2 + z^2),$

$$3\alpha_i + 4\beta_i = 6 + 3\gamma_j + 4\delta_j = m \quad \text{(for all $i$ and $j$)},$$

only if all $a_i$ and $c_j$ are zero.

Case I. If $m$ is odd,

$$3\alpha_i \equiv \alpha_i \equiv m \equiv 1 \pmod 2 \text{ for all } i,$$
$$3\gamma_j \equiv \gamma_j \equiv m \equiv 1 \pmod 2 \text{ for all } j.$$

Hence, $O_3$ is a common factor of the left member of (13), and we may divide
the congruence by it since

$$O_3 \not\equiv 0 \mod (x^2 + y^2 + z^2).$$

This reduces the problem to

Case II. If $m$ is even,

$$\alpha_i \equiv m \equiv 0 \pmod 2 \text{ for all } i,$$
$$\gamma_j \equiv m \equiv 0 \pmod 2 \text{ for all } j.$$

Hence, the left member of (13) is a polynomial in $O_3^2$, $O_4$, and $O_6$. If now we
make the substitution $z^2 = -x^2 - y^2$ in (13), the resulting expression must
vanish identically. Upon making this substitution and absorbing the constant
factors in the $a_i$ and $c_j$, (13) becomes

(14)   $\sum_i a_i\,(x^4 y^2 + x^2 y^4)^{\alpha_i/2}\,(x^4 + x^2 y^2 + y^4)^{\beta_i} + (x^2 - y^2)(x^2 + 2y^2)(2x^2 + y^2) \sum_j c_j\,(x^4 y^2 + x^2 y^4)^{\gamma_j/2}\,(x^4 + x^2 y^2 + y^4)^{\delta_j} = 0,$

identically. Our assertion is that all the $a_i$ and $c_j$ vanish. We assume the contrary
and obtain a contradiction.

We may assume that all the $a_i$, $c_j$ in (14) are different from zero; otherwise,
we could simply omit the terms with vanishing coefficients. We may assume
that $\alpha_1 < \alpha_i$, $\gamma_1 < \gamma_j$ for $i = 2, 3, \ldots$ and $j = 2, 3, \ldots$; this is a matter of
notation. Finally, we may assume that one of the numbers $\alpha_1$ and $\gamma_1$ is equal to
zero; otherwise, we could divide by a suitable power of $x^4 y^2 + x^2 y^4$. Both $\alpha_1$ and
$\gamma_1$ cannot vanish; otherwise, we should have $4\beta_1 = 6 + 4\delta_1$, $2(\beta_1 - \delta_1) = 3$.
But 3 is not an even number.

Now we set $y = 0$ and obtain:

$$a_1 x^m = 0, \text{ so that } a_1 = 0, \text{ if } \alpha_1 = 0;$$

$$2c_1 x^m = 0, \text{ so that } c_1 = 0, \text{ if } \gamma_1 = 0;$$

a contradiction in either case. This proves our assertion.








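For the other groups of the Cubic System, the series representing the
invariant operating polynomials are derived in the manner shown in the last
section. Tables I, II, IV, and V summarize these derivations completely.

TABLE IV
SERIES REPRESENTING OPERATING POLYNOMIALS OF THE FINITE GROUPS

Group ($n$ even) | Polynomials | Group ($n$ odd)
$\mathfrak{C}_n$ | $(1 - z)^{-1}(1 + C_n + C'_n + C_{2n} + C'_{2n} + \ldots)$ | $\mathfrak{C}_n$
$\mathfrak{D}_n$ | $(1 - z^2)^{-1}(1 + C_n + zC'_n + C_{2n} + zC'_{2n} + \ldots)$ | $\mathfrak{D}_n$
$\mathfrak{C}_n[\mathfrak{D}_n$ | $(1 - z)^{-1}(1 + C_n + C_{2n} + \ldots)$ | $\mathfrak{C}_n[\mathfrak{D}_n$
$\mathfrak{C}_{ni}$ | $(1 - z^2)^{-1}(1 + C_n + C'_n + C_{2n} + C'_{2n} + \ldots)$ | $\mathfrak{C}_n[\mathfrak{C}_{2n}$
$\mathfrak{D}_{ni}$ | $(1 - z^2)^{-1}(1 + C_n + C_{2n} + \ldots)$ | $\mathfrak{D}_n[\mathfrak{D}_{2n}$
$\mathfrak{C}_n[\mathfrak{C}_{2n}$ | $(1 - z^2)^{-1}(1 + zC_n + zC'_n + C_{2n} + C'_{2n} + \ldots)$ | $\mathfrak{C}_{ni}$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ | $(1 - z^2)^{-1}(1 + zC'_n + C_{2n} + \ldots)$ | $\mathfrak{D}_{ni}$
$\mathfrak{T}$ | $(1 - O_3)^{-1}(1 - O_4)^{-1}(1 + O_6)$ | $\mathfrak{T}$
$\mathfrak{O}$ | $(1 - O_3^2)^{-1}(1 - O_4)^{-1}(1 + O_3 O_6)$ | $\mathfrak{O}$
$\mathfrak{T}[\mathfrak{O}$ | $(1 - O_3)^{-1}(1 - O_4)^{-1}$ | $\mathfrak{T}[\mathfrak{O}$
$\mathfrak{T}_i$ | $(1 - O_3^2)^{-1}(1 - O_4)^{-1}(1 + O_6)$ | $\mathfrak{T}_i$
$\mathfrak{O}_i$ | $(1 - O_3^2)^{-1}(1 - O_4)^{-1}$ | $\mathfrak{O}_i$
$\mathfrak{I}$ | $(1 - I_6)^{-1}(1 - I_{10})^{-1}(1 + I_{15})$ | $\mathfrak{I}$
$\mathfrak{I}_i$ | $(1 - I_6)^{-1}(1 - I_{10})^{-1}$ | $\mathfrak{I}_i$

5. The icosahedral groups. In §III we found that

$$h(t; \mathfrak{I}) = \frac{1 + t^{15}}{(1 - t^6)(1 - t^{10})}.$$

With this in mind we wish to find an invariant of $\mathfrak{I}$ for each of the degrees six,
ten, and fifteen. Placing the icosahedron as explained in §I, we find these
invariants in a manner analogous to that used for the groups of the Cubic System.
In the following, $\tau = \tfrac{1}{2}(1 + \sqrt{5})$. The three invariants are

$$I_6 = (\tau^2 x^2 - y^2)(\tau^2 y^2 - z^2)(\tau^2 z^2 - x^2),$$

$$I_{10} = (x^4 + y^4 + z^4)(\tau^{-2} x^2 - \tau^2 y^2)(\tau^{-2} y^2 - \tau^2 z^2)(\tau^{-2} z^2 - \tau^2 x^2),$$

$$I_{15} = xyz\,(\tau x + \tau^{-1} y + z)(-\tau x + \tau^{-1} y + z)(\tau x - \tau^{-1} y + z)(\tau x + \tau^{-1} y - z)$$
$$\cdot\,(x + \tau y + \tau^{-1} z)(-x + \tau y + \tau^{-1} z)(x - \tau y + \tau^{-1} z)(x + \tau y - \tau^{-1} z)$$
$$\cdot\,(\tau^{-1} x + y + \tau z)(-\tau^{-1} x + y + \tau z)(\tau^{-1} x - y + \tau z)(\tau^{-1} x + y - \tau z).$$

We now conjecture that the set of invariant operating polynomials for $\mathfrak{I}$ is
represented by the series $(1 - I_6)^{-1}(1 - I_{10})^{-1}(1 + I_{15})$.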

To prove this conjecture it remains to show that all terms of a given degree
$m$ are linearly independent mod $(x^2 + y^2 + z^2)$.

Case I. $m$ even. We wish to show that

(15)   $\sum_i a_i\, I_6^{\alpha_i} I_{10}^{\beta_i} \equiv 0 \mod (x^2 + y^2 + z^2), \qquad 6\alpha_i + 10\beta_i = m,$

only if all $a_i$ are zero.

Case II. $m$ odd. We wish to show that

$$I_{15} \sum_i a_i\, I_6^{\alpha_i} I_{10}^{\beta_i} \equiv 0 \mod (x^2 + y^2 + z^2), \qquad 15 + 6\alpha_i + 10\beta_i = m,$$

only if all $a_i$ are zero. Since it can be shown that $I_{15} \not\equiv 0$, mod $(x^2 + y^2 + z^2)$,
Case II reduces to Case I.

TABLE V
INVARIANCE OF OPERATING POLYNOMIALS FOR $\mathfrak{T}$ WITH RESPECT TO DERIVED GROUPS

[Rows: the groups of the Cubic System. Columns: $O_3^2$, $O_3 O_6$, $O_3$, $O_4$, $O_6$. Entries: $+$ or $-$. The table entries are not legible in this copy.]
The remainder of the proof is similar to that of the preceding section and will
be briefly indicated. As before, we assume all $a_i$ different from zero and $\alpha_1 < \alpha_i$
for $i = 2, 3, \ldots$. Since

$$I_6 \not\equiv 0 \mod (x^2 + y^2 + z^2),$$

we may divide (15) by $I_6^{\alpha_1}$, obtaining an equivalent congruence in which
$\alpha_1 = 0$. If in this congruence we first let $z^2 = -x^2 - y^2$ and then let $y = \tau x$
we obtain

$$-32\tau^7(\tau^6 + 2\tau^2 + 1)(\tau^2 - 3)\,a_1 x^m = 0$$

identically. Hence $a_1 = 0$, a contradiction to the assumption that there were
any non-zero coefficients in (15).
It is easily seen that $I_6$ and $I_{10}$ are invariants of $\mathfrak{I}_i$, but $I_{15}$ is not. Hence, the
set of invariant operating polynomials for $\mathfrak{I}_i$ is $(1 - I_6)^{-1}(1 - I_{10})^{-1}$.


6. Invariant harmonic bases using the associated Legendre functions.
For the simpler groups, the cyclic, dihedral and related groups, representation
of the invariant harmonic basis, in terms of the associated Legendre functions,
is somewhat simpler to derive than in the Maxwell representation. It will be
shown later that the same basis is obtained, except for constant factors, in
both representations.

It is well known that

(16)   $P_m(\cos\theta),$
$P_{m,1}(\cos\theta)\cos\phi,\ P_{m,2}(\cos\theta)\cos 2\phi,\ \ldots,\ P_{m,m}(\cos\theta)\cos m\phi,$
$P_{m,1}(\cos\theta)\sin\phi,\ P_{m,2}(\cos\theta)\sin 2\phi,\ \ldots,\ P_{m,m}(\cos\theta)\sin m\phi$

are a set of linearly independent surface harmonics of order $m$ and all others
are expressible as linear combinations of them. By multiplying each of the
elements of the set (16) by $r^m$, we obtain an invariant harmonic basis of degree
$m$ for $\mathfrak{C}_1$.

The cyclic group $\mathfrak{C}_n$ is generated by the transformation $\phi \to \phi + 2\pi/n$.
Of the set (16), the elements

(17)   $P_m(\cos\theta),$
$P_{m,n}(\cos\theta)\cos n\phi,\ P_{m,2n}(\cos\theta)\cos 2n\phi,\ \ldots,\ P_{m,sn}(\cos\theta)\cos sn\phi,$
$P_{m,n}(\cos\theta)\sin n\phi,\ P_{m,2n}(\cos\theta)\sin 2n\phi,\ \ldots,\ P_{m,sn}(\cos\theta)\sin sn\phi$
$(sn \leq m < (s + 1)n)$

are clearly invariants of $\mathfrak{C}_n$. There are $2[m/n] + 1$ elements in the set (17),
and, from §III, $h_m = 2[m/n] + 1$ for $\mathfrak{C}_n$. Hence, (17) is an invariant basis of
surface harmonics of order $m$ for $\mathfrak{C}_n$.

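The bases for the derived groups are found by selecting those elements of
(17) which are invariants of the transformation generating the group from
$\mathfrak{C}_n$. It may be verified that in each case the basis obtained has the correct
number of elements (that is, $h_m$ of the group in question). These derivations
are summarized in Table VI.

TABLE VI
INVARIANT HARMONIC BASES IN TERMS OF THE ASSOCIATED LEGENDRE FUNCTIONS

(In all formulas below, the argument "$\cos\theta$" is omitted in the associated
Legendre functions.)

$\omega = 1, 2, \ldots, [m/n]$
$\mu = 1, 3, \ldots, [m/n]$ or $[m/n] - 1$ (whichever is odd)
$\nu = 2, 4, \ldots, [m/n]$ or $[m/n] - 1$ (whichever is even)

Group | Case | Basis
$\mathfrak{C}_n$ | | $P_m$, $P_{m,\omega n}\cos\omega n\phi$, $P_{m,\omega n}\sin\omega n\phi$
$\mathfrak{D}_n$ ($n$ even) | $m$ even | $P_m$, $P_{m,\omega n}\cos\omega n\phi$
$\mathfrak{D}_n$ ($n$ even) | $m$ odd | $P_{m,\omega n}\sin\omega n\phi$
$\mathfrak{D}_n$ ($n$ odd) | $m$ even | $P_m$, $P_{m,\nu n}\cos\nu n\phi$, $P_{m,\mu n}\sin\mu n\phi$
$\mathfrak{D}_n$ ($n$ odd) | $m$ odd | $P_{m,\mu n}\cos\mu n\phi$, $P_{m,\nu n}\sin\nu n\phi$
$\mathfrak{C}_n[\mathfrak{D}_n$ | | $P_m$, $P_{m,\omega n}\cos\omega n\phi$
$\mathfrak{C}_{ni}$ | $m$ even | $P_m$, $P_{m,\omega n}\cos\omega n\phi$, $P_{m,\omega n}\sin\omega n\phi$
$\mathfrak{C}_{ni}$ | $m$ odd | None
$\mathfrak{D}_{ni}$ ($n$ even) | $m$ even | $P_m$, $P_{m,\omega n}\cos\omega n\phi$
$\mathfrak{D}_{ni}$ ($n$ even) | $m$ odd | None
$\mathfrak{D}_{ni}$ ($n$ odd) | $m$ even | $P_m$, $P_{m,\nu n}\cos\nu n\phi$, $P_{m,\mu n}\sin\mu n\phi$
$\mathfrak{D}_{ni}$ ($n$ odd) | $m$ odd | None
$\mathfrak{C}_n[\mathfrak{C}_{2n}$ | $m$ even | $P_m$, $P_{m,\nu n}\cos\nu n\phi$, $P_{m,\nu n}\sin\nu n\phi$
$\mathfrak{C}_n[\mathfrak{C}_{2n}$ | $m$ odd | $P_{m,\mu n}\cos\mu n\phi$, $P_{m,\mu n}\sin\mu n\phi$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ ($n$ even) | $m$ even | $P_m$, $P_{m,\nu n}\cos\nu n\phi$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ ($n$ even) | $m$ odd | $P_{m,\mu n}\sin\mu n\phi$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ ($n$ odd) | $m$ even | $P_m$, $P_{m,\nu n}\cos\nu n\phi$
$\mathfrak{D}_n[\mathfrak{D}_{2n}$ ($n$ odd) | $m$ odd | $P_{m,\mu n}\cos\mu n\phi$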











156 BURNETT MEYER 


7. Equivalence of the two representations. For the groups ©,, D, and 
those of Types 2 and 3 derived from them, we have found invariant harmonic 
bases in two different representations—in terms of the differential operators 
and in terms of the associated Legendre functions. In all cases, however, the 
bases obtained are the same, except for constant factors. 

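This follows from the relationship (5, p. 134)

$$\frac{\partial^{\,n-m}}{\partial z^{\,n-m}}\left(\frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y}\right)^{m}\frac{1}{r} = \frac{(-1)^{n-m}(n-m)!}{r^{n+1}}\,(\cos m\phi + i\sin m\phi)\,P_{n,m}(\cos\theta).$$

Now, set $k = m$, $j = n - m$, multiply the above equation by $r^{2n+1}$, and
separate into real and imaginary parts:

$$r^{2(j+k)+1}\,\frac{\partial^{\,j}}{\partial z^{\,j}}\,C_k\,\frac{1}{r} = (-1)^{j}\,j!\;r^{j+k}\,P_{j+k,k}(\cos\theta)\cos k\phi, \tag{18}$$

and

$$r^{2(j+k)+1}\,\frac{\partial^{\,j}}{\partial z^{\,j}}\,C'_k\,\frac{1}{r} = (-1)^{j}\,j!\;r^{j+k}\,P_{j+k,k}(\cos\theta)\sin k\phi, \tag{19}$$

in which $C_k$ and $C'_k$ are the differential operators obtained by substituting
$\partial/\partial x$, $\partial/\partial y$, $\partial/\partial z$
for x, y, z in the operating polynomials previously defined.

The equivalence of the two representations can easily be shown by use of
(18), (19), and Tables IV and VI.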
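As a sanity check on the signs, the relationship quoted from (5, p. 134) can be worked out by hand in the two cases of degree n = 1. The second case assumes the sign convention $P_{n,m}(\mu) = (-1)^m (1-\mu^2)^{m/2}\, d^m P_n/d\mu^m$; that convention is an assumption made here, not something stated in this section.

```latex
% Case n = 1, m = 0 (no convention needed, since P_{1,0} = P_1):
\frac{\partial}{\partial z}\,\frac{1}{r} = -\frac{z}{r^{3}} = -\frac{\cos\theta}{r^{2}}
  = \frac{(-1)^{1}\,1!}{r^{2}}\,P_{1}(\cos\theta).

% Case n = 1, m = 1 (uses x + iy = r\sin\theta\,e^{i\phi}):
\left(\frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y}\right)\frac{1}{r}
  = -\frac{x + iy}{r^{3}} = -\frac{\sin\theta}{r^{2}}\,(\cos\phi + i\sin\phi)
  = \frac{(-1)^{0}\,0!}{r^{2}}\,(\cos\phi + i\sin\phi)\,P_{1,1}(\cos\theta),
\qquad P_{1,1}(\cos\theta) = -\sin\theta .
```

The first case matches the factor $(-1)^{n-m}(n-m)!$ directly; the second matches it only with the stated sign convention for $P_{n,m}$, so a reader using the convention without the factor $(-1)^m$ should expect an extra sign $(-1)^m$ on the right-hand side.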


REFERENCES 


1. H. Bethe, Termaufspaltung in Kristallen, Ann. Phys. (5), 3 (1929), 133-208.
2. W. Burnside, Theory of groups of finite order, 2nd ed. (Cambridge, 1911).
3. H. S. M. Coxeter, Regular polytopes (New York, 1948).
3a. H. S. M. Coxeter, The product of the generators of a finite group generated by reflections, Duke Math. J., 18 (1951), 765-782.
4. W. Ehlert, Über das Schwingungs- und Rotationsspektrum einer Molekel vom Typus CH$_4$, Z. Phys., 51 (1928), 6-33.
5. E. W. Hobson, The theory of spherical and ellipsoidal harmonics (Cambridge, 1931).
6. J. Hodgkinson, Harmonic functions with polyhedral symmetry, J. London Math. Soc., 10 (1935), 221-226.
7. O. Laporte, Polyhedral harmonics, Z. Naturforsch., 3a (1948), 447-456.
8. C. C. Macduffee, The theory of matrices (Berlin, 1933).
9. T. Molien, Über die Invarianten der linearen Substitutionsgruppen, S. B. Akad. Wiss. Berlin, 2 (1897), 1152-1156.
10. G. Pólya and B. Meyer, Sur les symétries des fonctions sphériques de Laplace, C. R. Acad. Sci., Paris, 228 (1949), 28-30.
11. G. Pólya and B. Meyer, Sur les fonctions sphériques de Laplace de symétrie cristallographique donnée, C. R. Acad. Sci., Paris, 228 (1949), 1083-1084.
12. G. Pólya and G. Szegő, Isoperimetric inequalities in mathematical physics (Princeton, 1951).
13. E. G. C. Poole, Spherical harmonics having polyhedral symmetry, Proc. London Math. Soc. (2), 33 (1932), 435-456.
14. A. Speiser, Die Theorie der Gruppen von endlicher Ordnung (New York, 1945).
15. E. Stiefel, Two applications of group characters to the solution of boundary-value problems, J. Res. Nat. Bur. Standards, 48 (1952), 424-427.
16. H. W. Turnbull, The theory of determinants, matrices, and invariants (London, 1928).
17. H. Weyl, Symmetry (Princeton, 1952).


University of Arizona 











ON CERTAIN VECTOR SUBSPACES OF $L^p$
A. GROTHENDIECK 


The following theorem answers a question that was put to me independently
by H. Mirkil and Professor E. Farah, in the particular case p = 2:


THEOREM 1. Let M be a locally compact space equipped with a bounded measure
$\mu$, and let $1 \le p < +\infty$. Let H be a vector subspace of $L^\infty(\mu)$ that is closed in
$L^p(\mu)$. Then H is finite-dimensional.


PROOF. By the closed graph theorem, the identity map of H, equipped with the
topology induced by $L^p$, into $L^\infty$ is continuous. Hence the topologies
induced on H by $L^p$ and by $L^\infty$ coincide, and they therefore also coincide
with the topology induced by $L^q$ for $p < q < +\infty$. Consequently H is also a
complete, hence closed, vector subspace of $L^q$. This already allows us to assume
$p > 1$ (otherwise we replace p by some $q > p$). Then $L^p$ is reflexive, so H is reflexive.
It follows that the identity maps of H into $L^\infty$ and of $L^\infty$ into $L^p$ are weakly
compact. Now it is known that a weakly compact linear map from a
space $L^\infty$ into a separated locally convex space transforms the weakly
compact subsets of the Banach space $L^\infty$ into compact subsets
(2, th. 1, p. 139). Hence the unit ball of H is a compact subset
of $L^p$, and therefore of H. Consequently H is finite-dimensional.
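The step asserting that the $L^q$ topology on H agrees with the $L^p$ topology can be made explicit with two elementary inequalities. The following is a sketch under the assumptions that $\mu$ is positive and that C denotes the constant supplied by the closed graph theorem, so that $\|f\|_\infty \le C\,\|f\|_p$ for every $f \in H$; the name C is introduced here for illustration only.

```latex
% Upper bound: split |f|^q = |f|^{q-p} |f|^p and use the sup-norm bound on H.
\|f\|_q^{\,q} = \int |f|^{\,q-p}\,|f|^{\,p}\,d\mu
  \;\le\; \|f\|_\infty^{\,q-p}\,\|f\|_p^{\,p}
  \;\le\; C^{\,q-p}\,\|f\|_p^{\,q},
\qquad\text{so}\qquad \|f\|_q \le C^{\,1-p/q}\,\|f\|_p .

% Lower bound: Holder's inequality on a space of finite total mass \mu(M).
\|f\|_p \;\le\; \mu(M)^{\,1/p-1/q}\,\|f\|_q \qquad (p < q < +\infty).
```

Together these show that the two norms are equivalent on H, which is exactly what is needed to pass from p to a larger exponent q.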


COROLLARY. Let M be a locally compact space equipped with an arbitrary measure $\mu$,
and let $1 \le p < +\infty$. Let H be a closed vector subspace of $L^p(\mu)$,
and $f_0 \in L^p(\mu)$, such that every $f \in H$ is bounded in absolute value by a multiple of $f_0$.
Then H is finite-dimensional.


(Of course, Theorem 1 is a particular case of the corollary.) Let $d\nu = (f_0)^p\,d\mu$
be the bounded measure with density $(f_0)^p$ with respect to $\mu$. The map $f \to f/f_0$ (where
we agree to take $f(t)/f_0(t) = 0$ when $f_0(t) = 0$) is clearly a
metric isomorphism of H into $L^p(\nu)$, and the image of H is contained in
$L^\infty(\nu)$. By Theorem 1, this image is therefore finite-dimensional, and
consequently so is H.
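The claim that $f \to f/f_0$ preserves the norm can be checked in one line. The sketch below assumes $f_0 \ge 0$ and writes c for a constant with $|f| \le c\,f_0$; both the positivity and the name c are assumptions made here for illustration.

```latex
% Norm of the image in L^p(\nu), with d\nu = f_0^p \, d\mu:
\int \left|\frac{f}{f_0}\right|^{p} d\nu
  = \int_{\{f_0 \neq 0\}} \frac{|f|^{p}}{f_0^{\,p}}\; f_0^{\,p}\,d\mu
  = \int_{\{f_0 \neq 0\}} |f|^{p}\,d\mu
  = \int |f|^{p}\,d\mu ,
% the last equality because |f| \le c f_0 forces f = 0 wherever f_0 = 0.

% Boundedness of the image and of the measure:
\left|\frac{f}{f_0}\right| \le c ,
\qquad
\nu(M) = \int f_0^{\,p}\,d\mu = \|f_0\|_p^{\,p} < +\infty .
```

So the image of H is a closed subspace of $L^p(\nu)$ consisting of bounded functions, and $\nu$ is bounded, which is the setting of Theorem 1.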

Let us also remark that the conclusion of the corollary still holds if we suppose
that there exists a sequence $(f_i)$ of positive elements of $L^p(\mu)$ such that every $f \in H$
is bounded in absolute value by a multiple of one of the $f_i$. Indeed, setting
$f_0 = \sum_i 2^{-i} f_i/\|f_i\|$, $f_0$ is an element of $L^p(\mu)$ satisfying the conditions of the
corollary.
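The two properties required of $f_0$ can be verified directly. This is a sketch assuming the indices run over $i = 1, 2, \ldots$, that no $f_i$ is zero, and that $\|\cdot\|$ is the norm of $L^p(\mu)$.

```latex
% f_0 belongs to L^p(\mu): the series converges absolutely in norm.
\|f_0\|_p \;\le\; \sum_{i \ge 1} 2^{-i}\,\frac{\|f_i\|_p}{\|f_i\|_p} \;=\; \sum_{i \ge 1} 2^{-i} \;=\; 1 .

% f_0 dominates a multiple of each f_i: every term of the series is positive, so
2^{-i}\,\frac{f_i}{\|f_i\|_p} \;\le\; f_0
\qquad\Longrightarrow\qquad
f_i \;\le\; 2^{\,i}\,\|f_i\|_p\; f_0 .
```

Hence if $|f| \le c\,f_i$ for some i, then $|f| \le c\,2^{i}\|f_i\|_p\, f_0$, so every $f \in H$ is bounded by a multiple of the single function $f_0$.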

The following theorem shows to what extent Theorem 1 is the best
of its kind:





Received October 27, 1953.










THEOREM 2. Let M be a locally compact space equipped with a measure $\mu$.

(1) If $\mu$ is not discrete (i.e. not a sum of a family of point masses),
then there exists an infinite-dimensional vector space H consisting of functions belonging to all the $L^p(\mu)$
for $1 \le p < +\infty$, and closed in all these spaces.

(2) If $\mu$ is discrete, then every vector space H that is contained and closed in both
$L^p(\mu)$ and $L^q(\mu)$, with $1 \le p, q \le +\infty$ and $p \ne q$, is finite-dimensional.


PROOF. (1) It is known that there exists a compact subset of M, of
nonzero measure, every point of which has measure zero. This reduces us at once
to the case where M is itself a compact space of nonzero measure, all of whose
points have measure zero. Moreover, we may obviously suppose $\mu$ positive
and of total mass 1. One then shows that there exists a continuous map
$\varphi$ of M onto the interval $(0, 1) = M_0$ such that $\varphi(\mu)$ is
Lebesgue measure $\mu_0$. (We shall take this fact, which is probably known, for granted; it is proved in
the same way as Urysohn's theorem on the existence of continuous functions
on a normal space.) Then, for $1 \le p \le \infty$, $f \to f \circ \varphi$ is a metric
isomorphism of $L^p(\mu_0)$ into $L^p(\mu)$, so we are reduced at once to the case
$M = M_0$, $\mu = \mu_0$. Now let $(n_i)$ be a sequence of integers such that $n_{i+1}/n_i > \lambda$,
where $\lambda$ is a constant > 1, and let H be the closed vector subspace of $L^1(\mu)$
generated by the functions $\exp(2i\pi n_i x)$, i.e. the space of integrable functions
whose Fourier coefficients vanish for all indices other than the $n_i$.
It is well known (3) that every $f \in H$ belongs to every $L^p(\mu)$ for $1 \le p < +\infty$.
It follows at once that H is also closed in all these spaces (by the
closed graph theorem).
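The lacunary construction can be illustrated in the single case p = 4, where the norm is computable exactly from Fourier coefficients. The sketch below is an illustration only, not the argument of (3): it assumes the particular frequencies $n_i = 3^i$ (ratio 3) and works with finite trigonometric polynomials. It uses Parseval's identity twice, $\|f\|_2^2 = \sum |c_n|^2$ and $\|f\|_4^4 = \|f^2\|_2^2$, where the coefficients of $f^2$ are the self-convolution of those of f. All function names are hypothetical.

```python
from collections import defaultdict


def lacunary_polynomial(amplitudes, ratio=3):
    """Coefficients {frequency: amplitude} with frequencies ratio, ratio^2, ..."""
    return {ratio ** (i + 1): a for i, a in enumerate(amplitudes)}


def l2_norm_sq(coeffs):
    """Squared L^2(0, 1) norm of sum_n coeffs[n] exp(2 i pi n x), by Parseval."""
    return sum(abs(c) ** 2 for c in coeffs.values())


def square_coeffs(coeffs):
    """Fourier coefficients of f^2: convolution of the coefficients with themselves."""
    out = defaultdict(complex)
    for n1, c1 in coeffs.items():
        for n2, c2 in coeffs.items():
            out[n1 + n2] += c1 * c2
    return dict(out)


def l4_norm_fourth(coeffs):
    """Fourth power of the L^4(0, 1) norm, computed as the squared L^2 norm of f^2."""
    return l2_norm_sq(square_coeffs(coeffs))


if __name__ == "__main__":
    f = lacunary_polynomial([1, 1, 1, 1, 1])
    print(l2_norm_sq(f), l4_norm_fourth(f))
```

For powers of 3 every sum $n_a + n_b$ determines the pair $\{a, b\}$, so $\|f\|_4^4 = 2\|f\|_2^4 - \sum |c_n|^4 \le 2\|f\|_2^4$ with a constant independent of the number of terms. With five unit amplitudes this gives $\|f\|_2^2 = 5$ and $\|f\|_4^4 = 45$.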

(2) We are reduced at once to the case where M is a discrete space, each $m \in M$
carrying a mass $c_m > 0$. Since the topologies induced on H by $L^p$
and $L^q$ are identical (closed graph theorem), we are reduced to proving that the
closed vector subspace $H_0$ of H generated by an arbitrary sequence $(f_i)$ is
finite-dimensional. Now the $f_i$ all vanish outside one and the same
countable subset $M_0$ of M, and the same is therefore true of every $f \in H_0$. This
reduces us to the case where M is countable, and we may obviously suppose M infinite.
But in this case $L^p$ is isomorphic to the space $l^p$ of sequences whose p-th powers
are summable. This is trivial if $p = +\infty$, and for $p < +\infty$ the equality

$$\Big(\sum_m |f(m)|^p\, c_m\Big)^{1/p} = \Big(\sum_m \big|c_m^{1/p} f(m)\big|^p\Big)^{1/p}$$

shows that the map $f \to \gamma f$, where $\gamma(m) = c_m^{1/p}$, is a metric
isomorphism of $L^p$ onto $l^p$. On the other hand, by a theorem of Banach (1, chap.
XII, §2, th. 1), every infinite-dimensional closed vector subspace of $l^p$
contains a vector subspace isomorphic to $l^p$. Hence if H were infinite-dimensional,
$l^p$ would be isomorphic to a vector subspace of H, hence of $L^q$, hence of
$l^q$. But this is impossible by a theorem of Banach (1, chap. XII, §3, th.
7). Banach states only that $l^p$ is not isomorphic to a vector subspace
of $l^q$ if $1 < p, q < +\infty$, $p \ne q$, but the result remains valid if
p and q are allowed to take the value 1. Indeed, $l^1$ is not isomorphic to
a vector subspace of $l^p$ for $1 < p < +\infty$, since $l^p$ is reflexive and $l^1$
is not; and $l^p$ is not isomorphic to a vector subspace of $l^1$, for, as
we recalled above, it would then contain a vector subspace
isomorphic to $l^1$, which is impossible.
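The metric isomorphism between the weighted space $L^p$ on a discrete set and the sequence space $l^p$ is easy to check numerically on a finite sample. The following minimal sketch, with hypothetical function names, represents a function on a finite discrete set by the list of its values and the measure by the list of point masses.

```python
def weighted_lp_norm(f, c, p):
    """Norm of f in L^p of the discrete measure with point masses c[m] > 0."""
    return sum(abs(fm) ** p * cm for fm, cm in zip(f, c)) ** (1.0 / p)


def lp_norm(g, p):
    """Norm of the sequence g in l^p (unit point masses)."""
    return sum(abs(x) ** p for x in g) ** (1.0 / p)


def to_sequence(f, c, p):
    """The map f -> gamma * f with gamma(m) = c[m] ** (1/p)."""
    return [cm ** (1.0 / p) * fm for fm, cm in zip(f, c)]


if __name__ == "__main__":
    masses = [0.5, 2.0, 3.0]
    values = [1.0, -2.0, 0.5]
    p = 3
    print(weighted_lp_norm(values, masses, p))
    print(lp_norm(to_sequence(values, masses, p), p))
```

The two printed numbers agree up to rounding, which is the displayed equality of the proof restricted to a finite set; the multiplier $\gamma$ depends on p, so the identification of $L^p$ with $l^p$ and that of $L^q$ with $l^q$ use different maps.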

We note in passing that the auxiliary result used in the proof
of (1) is convenient for reducing various questions about arbitrary
non-discrete measures to the analogous questions about
Lebesgue measure on (0, 1). For example, one finds immediately: if $\mu$ is a
non-discrete measure on the locally compact space M, then there exist subsets
of M that are not $\mu$-measurable; one can find in $L^1(\mu)$ sequences that converge
weakly without converging strongly; if $\mu$ is bounded, the identity map
of $L^\infty(\mu)$ into $L^1(\mu)$ is not compact; the convergence of a sequence in $L^p(\mu)$
($1 \le p < +\infty$) does not imply its convergence almost everywhere, nor even
that it is order-bounded (bounded in the lattice sense) in the space of all classes of
everywhere finite measurable numerical functions (classes modulo equality locally
almost everywhere); a fortiori the unit ball of $L^p(\mu)$ is not order-bounded
in that space, and so on. Indeed, it suffices to verify these assertions
in the space $L^1(\mu_0)$ built on Lebesgue measure. Since for a
discrete measure $\mu$ the opposite of each of the preceding statements holds,
we obtain as many necessary and sufficient conditions for a measure
to be discrete. From this one deduces, for example, that for a compact space M, the fact that
every measure on M is discrete is a topological vector space property of
the space E = C(M) of continuous functions on M: it means that every
sequence in E' that converges for $\sigma(E', E'')$ converges strongly, or again that
every subset of E' that is compact for $\sigma(E', E'')$ is strongly compact. It is
easy to check that if M is metrizable, the property in question means that M
contains no nonempty closed subset without isolated points, i.e. that M is countable.


REFERENCES


1. S. Banach, Théorie des opérations linéaires (Varsovie, 1932).
2. A. Grothendieck, Sur les applications linéaires faiblement compactes d'espaces du type C(K), Can. J. Math., 5 (1953), 129-173.
3. A. Zygmund, Trigonometrical series (Varsovie, 1935).


University of São Paulo










